This paper investigates the effects of temporal clipping on perceived speech quality. Temporal clipping usually results from voice activity detection (VAD) or from the nonlinear processor of a line echo canceller, and the clipped speech portions are replaced by comfort noise. A nonintrusive algorithm is proposed to predict speech quality from the clipping statistics. Mean opinion score (MOS) is used as the metric for speech quality and is measured by perceptual evaluation of speech quality (PESQ). The impacts of speech frame size and noise spectrum on the algorithm are also investigated. The results show that the proposed algorithm predicts speech quality effectively: the correlation coefficient between the prediction and the measurement is about 0.975, and the root-mean-square error of the prediction is 0.20 MOS. The algorithm can be used as an integral part of a general speech quality assessment scheme for voice over Internet protocol (VoIP).
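The two figures of merit quoted above, the correlation coefficient and the root-mean-square error between predicted and measured MOS, can be computed as in the following minimal sketch. It is only an illustration of the standard definitions (Pearson correlation is assumed here, since the abstract does not name the variant), not the authors' evaluation code. The function names and the sample scores are hypothetical and do not come from the paper.

```python
import math


def pearson_correlation(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of centered products and centered squares.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def rmse(predicted, measured):
    """Root-mean-square error between predicted and measured scores."""
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Hypothetical MOS values, made up for illustration only.
    predicted_mos = [4.1, 3.6, 3.0, 2.4, 1.9]
    measured_mos = [4.2, 3.4, 3.1, 2.6, 1.8]
    print("correlation:", pearson_correlation(predicted_mos, measured_mos))
    print("RMSE (MOS):", rmse(predicted_mos, measured_mos))
```

In this setting `measured` would hold the PESQ-derived MOS for each test condition and `predicted` the output of the nonintrusive predictor for the same conditions.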

Additional Metadata
Keywords: Mean opinion score (MOS), Nonintrusive method, Quality of experience (QoE), Speech quality, Temporal clipping, Voice activity detection (VAD), Voice over Internet protocol (VoIP)
Journal: IEEE Transactions on Instrumentation and Measurement
Ding, L. (Lijing), Radwan, A. (Ayman), El-Hennawey, M.S. (Mohamed Samy), & Goubran, R. (2006). Measurement of the effects of temporal clipping on speech quality. IEEE Transactions on Instrumentation and Measurement, 55(4), 1197–1203. doi:10.1109/TIM.2006.876538