A discrete cosine transform (DCT) domain speech enhancement algorithm is proposed that models the evolution of speech DCT coefficients as a time-varying autoregressive process. Rao-Blackwellized particle filter (RBPF) techniques are used to estimate the model parameters and recover the clean signal coefficients. By using very low-order models for each coefficient and operating at a decimated frame rate, the proposed approach significantly reduces complexity compared to the standard full-band RBPF speech enhancement algorithm. Performance is also improved: modeling the speech signal in the DCT domain is shown to provide a better fit in spectral troughs, leading to more noise reduction and less speech distortion. To illustrate possible frequency-dependent processing strategies, a hybrid structure is proposed that offers a complexity/performance trade-off by substituting a simple DCT Wiener filter for the DCT-RBPF in some bands. In comparisons with high-performing speech enhancement algorithms using wideband speech and noise, the proposed DCT-RBPF algorithm achieves higher scores on objective quality and intelligibility measures.
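
As a rough illustration of the simpler component of the hybrid structure, the sketch below applies a per-coefficient Wiener gain in the DCT domain to a single frame. It is not the authors' implementation: the function names (`dct_ii`, `idct_ii`, `wiener_gains`, `enhance_frame`), the power-subtraction estimate of the speech variance, and the assumption that the noise variance of each coefficient is known in advance are all illustrative choices that the abstract does not specify. The RBPF stage is not shown.

```python
import math


def dct_ii(x):
    """Orthonormal DCT-II of a real sequence (direct O(n^2) evaluation)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def idct_ii(c):
    """Inverse of the orthonormal DCT-II above (an orthonormal DCT-III)."""
    n = len(c)
    out = []
    for t in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(
            c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (t + 0.5) * k / n)
            for k in range(1, n)
        )
        out.append(s)
    return out


def wiener_gains(noisy_coeffs, noise_var):
    """Per-coefficient Wiener gain, speech_var / (speech_var + noise_var).

    Assumption: the speech variance is estimated by crude power subtraction,
    max(y^2 - noise_var, 0), from the single noisy coefficient y.
    """
    gains = []
    for y, nv in zip(noisy_coeffs, noise_var):
        speech_var = max(y * y - nv, 0.0)
        denom = speech_var + nv
        # If both variances are zero the coefficient is zero; pass it through.
        gains.append(speech_var / denom if denom > 0.0 else 1.0)
    return gains


def enhance_frame(frame, noise_var):
    """Transform a frame, attenuate each DCT coefficient, and transform back."""
    coeffs = dct_ii(frame)
    gains = wiener_gains(coeffs, noise_var)
    return idct_ii([g * c for g, c in zip(gains, coeffs)])
```

In this sketch a coefficient whose energy falls below the assumed noise variance is set to zero, while one well above it passes nearly unchanged. A practical system would smooth the variance estimates across frames and use a fast transform rather than the direct sums shown here.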

Additional Metadata
Keywords Discrete cosine transform (DCT), Noise reduction, Particle filtering, Speech enhancement
Persistent URL dx.doi.org/10.1016/j.specom.2010.05.005
Journal Speech Communication
Citation
Laska, B. (Brady), Bolić, M. (Miodrag), & Goubran, R. (2010). Discrete cosine transform particle filter speech enhancement. Speech Communication, 52(9), 762–775. doi:10.1016/j.specom.2010.05.005