Speech enhancement using fourth-order cumulants and optimum filters in the subband domain
A new method for speech enhancement using time-domain optimum filters and fourth-order cumulants (FOC) is proposed, based on newly established properties of the FOC of speech signals. In the exploratory part of the paper, the analytical expression of the FOC of subbanded speech is derived assuming a sinusoidal model with up to two harmonics per band. Important properties of this cumulant are revealed, and actual speech data are used to verify the derivations and the underlying model. In the application part of the work, speech enhancement is formulated as an estimation problem, and the expression for the time-domain causal optimum filters is derived for a pth-order system. The key idea is to use the FOC of the noisy speech to estimate the parameters required for the enhancement filters, namely the second-order statistics of the speech and noise. It is shown that the kurtosis and the diagonal slice of the FOC may be used to estimate such parameters as the SNR, the speech autocorrelation and the probability of speech presence in a given band. Subjective listening and examination of the spectrograms show that the resulting algorithm is effective on typical noises encountered in mobile telephony. Compared to the TIA-IS127 standard for noise reduction, it achieves more noise reduction overall and better speech preservation in Gaussian, street and fan noise. Its effectiveness diminishes, however, for harmonic and impulsive noise types such as office and car engine noise, where discrimination between speech and noise based on the FOC becomes more difficult.
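As a rough illustration of the two statistics named in the abstract, the sketch below computes sample estimates of the diagonal slice of the FOC and the normalized kurtosis for a zero-mean real sequence. It uses the standard textbook definition C4(t, t, t) = E[x(n) x(n+t)^3] - 3 R(0) R(t), where R is the autocorrelation. It is not the paper's estimator or its subband filtering scheme, and the function names (`center`, `autocorr`, `foc_diagonal_slice`, `normalized_kurtosis`) and the test signals are hypothetical, chosen only for this example.

```python
import math
import random


def center(x):
    """Remove the sample mean so the zero-mean cumulant formulas apply."""
    mean = sum(x) / len(x)
    return [v - mean for v in x]


def autocorr(x, lag):
    """Biased sample autocorrelation R(lag) of a zero-mean sequence."""
    n = len(x)
    return sum(x[i] * x[i + lag] for i in range(n - lag)) / n


def foc_diagonal_slice(x, lag):
    """Sample estimate of C4(lag, lag, lag) for a zero-mean real sequence:
    E[x(n) x(n+lag)^3] - 3 R(0) R(lag)."""
    n = len(x)
    m4 = sum(x[i] * x[i + lag] ** 3 for i in range(n - lag)) / n
    return m4 - 3.0 * autocorr(x, 0) * autocorr(x, lag)


def normalized_kurtosis(x):
    """Zero-lag FOC divided by the squared variance (zero for a Gaussian)."""
    r0 = autocorr(x, 0)
    return foc_diagonal_slice(x, 0) / (r0 * r0)


# Assumed test signals: a 200 Hz sinusoid sampled at 8 kHz (one second,
# an integer number of periods) and seeded white Gaussian noise.
fs, f0, n_samples = 8000, 200, 8000
sine = [math.cos(2.0 * math.pi * f0 * i / fs) for i in range(n_samples)]

rng = random.Random(0)
noise = center([rng.gauss(0.0, 1.0) for _ in range(20000)])

print("sinusoid kurtosis:", normalized_kurtosis(sine))   # close to -1.5
print("Gaussian kurtosis:", normalized_kurtosis(noise))  # close to 0
```

The sinusoid gives a normalized kurtosis near -1.5, while Gaussian noise gives a value near zero. That contrast is the general reason a fourth-order statistic can help separate harmonic, speech-like components from Gaussian noise, and it is also consistent with the abstract's remark that harmonic noises are harder to discriminate from speech.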
Keywords: Higher order statistics, Noise reduction, Speech enhancement
Nemer, E. (Elias), Goubran, R., & Mahmoud, S. (Samy). (2002). Speech enhancement using fourth-order cumulants and optimum filters in the subband domain. Speech Communication, 36(3-4), 219–246. doi:10.1016/S0167-6393(00)00081-9