Prosody is an important cue for identifying dialog acts. In this paper, we show that modeling the sequence of acoustic-prosodic values as n-gram features with a maximum entropy mod...
With the development of voice transformation and speech synthesis technologies, speaker identification systems are likely to face attacks from imposters who use voice transformed ...
Qin Jin, Arthur R. Toth, Alan W. Black, Tanja Schu...
This paper shows how to improve Hidden Conditional Random Fields (HCRFs) for phone classification by applying various speaker adaptation techniques. These include Maximum A Poste...
Yun-Hsuan Sung, Constantinos Boulis, Daniel Jurafs...
In this research, an iterative and unsupervised Turbo-style algorithm is presented and implemented for the task of automatic lexical acquisition. The algorithm makes use of spoken...
Ghinwa F. Choueiter, Mesrob I. Ohannessian, Stepha...
We present a framework for speech recognition that accounts for hidden articulatory information. We model the articulatory space using a codebook of articulatory configurations g...
Language model (LM) adaptation is often achieved by combining a generic LM with a topic-specific model that is more relevant to the target document. Unlike previous work on unsup...
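As an illustration of combining a generic LM with a topic-specific one, the sketch below uses linear interpolation over unigram probabilities. This is one common way to combine two models and is an assumption here, not necessarily the method of the paper above; the function name `interpolate`, the weight `lam`, and the toy distributions are all hypothetical.

```python
def interpolate(p_generic, p_topic, lam):
    """Linearly interpolate two unigram distributions (illustrative only).

    lam is the weight given to the topic-specific model; the generic
    model receives 1 - lam. Words missing from a model get probability 0.
    """
    vocab = set(p_generic) | set(p_topic)
    return {
        w: (1.0 - lam) * p_generic.get(w, 0.0) + lam * p_topic.get(w, 0.0)
        for w in vocab
    }


# Toy distributions, invented for this example.
generic = {"the": 0.5, "speech": 0.25, "model": 0.25}
topic = {"speech": 0.5, "prosody": 0.5}

adapted = interpolate(generic, topic, lam=0.5)
```

Because both inputs sum to one and the weights sum to one, the interpolated distribution also sums to one; words favored by the topic model (here "speech" and "prosody") gain probability mass at the expense of the rest.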
Networked embedded acoustic sensors and imagers allow scientists to observe biological and environmental phenomena at high sampling rates and multiple scales. Such sampling can cr...
Michael Allen, Eric Graham, Shaun Ahmadian, Tetsun...
A top-down task-dependent model guides attention to likely target locations in cluttered scenes. Here, a novel biologically plausible top-down auditory attention model is presente...
This paper describes a simple method for significantly improving Tandem features used to train acoustic models for large-vocabulary speech recognition. The linear activations at ...
A knowledge representation formalism for SLU is introduced. It is used for incremental and partially automated annotation of the MEDIA corpus in terms of semantic structures. An a...