Voice activity detection using harmonic frequency components in likelihood ratio test



This paper proposes a new statistical model-based likelihood ratio test (LRT) voice activity detector (VAD) to obtain reliable speech/non-speech decisions. In the proposed method, the likelihood ratio (LR) is calculated differently for voiced and unvoiced frames: for voiced frames, only DFT bins containing harmonic spectral peaks are selected for LR computation. To evaluate the new VAD’s effectiveness in improving the noise robustness of automatic speech recognition (ASR), its decisions are applied to preprocessing techniques such as non-linear spectral subtraction, the minimum mean square error short-time spectral amplitude estimator, and frame dropping. In ASR experiments conducted on the Aurora2 database, the proposed harmonic frequency-based LRTs give better results than conventional LRT-based VADs and the standard G.729B and ETSI AMR VADs.
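The bin-selection idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so everything below is an assumption: the per-bin log-LR uses the common complex-Gaussian form from the statistical model-based VAD literature, the pitch estimate `f0_hz` is taken as given, and the helper names (`bin_log_lr`, `harmonic_bins`, `frame_log_lr`) are hypothetical rather than the authors' implementation.

```python
import numpy as np

def bin_log_lr(power, noise_var, xi):
    """Per-bin log likelihood ratio under an assumed complex-Gaussian model.

    power     : |X_k|^2 for each DFT bin of the frame
    noise_var : noise variance estimate per bin
    xi        : a priori SNR estimate per bin
    """
    gamma = power / noise_var  # a posteriori SNR
    return gamma * xi / (1.0 + xi) - np.log1p(xi)

def harmonic_bins(f0_hz, sample_rate, n_fft, n_bins):
    """Indices of the DFT bins nearest to integer multiples of f0 (hypothetical helper)."""
    harmonics = np.arange(f0_hz, sample_rate / 2, f0_hz)
    idx = np.round(harmonics * n_fft / sample_rate).astype(int)
    return np.unique(idx[idx < n_bins])

def frame_log_lr(power, noise_var, xi, f0_hz=None, sample_rate=8000, n_fft=256):
    """Frame-level log-LR: mean over all bins, or over harmonic bins only.

    f0_hz=None stands for an unvoiced frame (all bins are used); a pitch value
    stands for a voiced frame (only bins at harmonic peaks are used).
    """
    llr = bin_log_lr(power, noise_var, xi)
    if f0_hz is None:
        return float(llr.mean())
    sel = harmonic_bins(f0_hz, sample_rate, n_fft, len(power))
    return float(llr[sel].mean())

# Toy voiced frame: flat unit-variance noise with strong peaks at multiples of 200 Hz.
n_fft, fs = 256, 8000
noise_var = np.ones(n_fft // 2 + 1)
power = np.ones(n_fft // 2 + 1)
peaks = harmonic_bins(200.0, fs, n_fft, len(power))
power[peaks] = 20.0
xi = np.maximum(power / noise_var - 1.0, 0.0)  # crude a priori SNR estimate (assumption)

full_band = frame_log_lr(power, noise_var, xi)
harmonic_only = frame_log_lr(power, noise_var, xi, f0_hz=200.0)
```

In this toy frame, averaging over all bins dilutes the evidence from the harmonic peaks with noise-only bins, while restricting the average to harmonic bins keeps the frame-level score high. A speech/non-speech decision would then compare the score against a threshold, which this sketch leaves unspecified.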

Lee Ngee Tan, Bengt J. Borgstrom, Abeer Alwan


Harmonic Spectral Peaks | ICASSP 2010 | Likelihood Ratio | Model-based Likelihood Ratio | Signal Processing


Post Info

Added	06 Dec 2010
Updated	06 Dec 2010
Type	Conference
Year	2010
Where	ICASSP
Authors	Lee Ngee Tan, Bengt J. Borgstrom, Abeer Alwan
