Phoneme segmentation is a fundamental problem in many speech recognition and synthesis studies. Unsupervised phoneme segmentation assumes no knowledge of linguistic content and a...
Understanding multi-party meetings involves tasks such as dialog act segmentation and tagging, action item extraction, and summarization. In this paper we introduce a new task for...
In this paper, we describe a novel statistical approach to the vocal tract transfer function (VTTF) estimation of a speech signal based on a factor analyzed trajectory hidden Mark...
This paper presents a modulation-based reconstruction method for audio signals across long gaps of missing samples. We use LTI filterbanks followed by a multiplicative model that...
We present a technique for denoising speech using nonnegative matrix factorization (NMF) in combination with statistical speech and noise models. We compare our new technique to s...
Kevin W. Wilson, Bhiksha Raj, Paris Smaragdis, Aja...
Robust Spoken Language Understanding (SLU) is a key component of spoken dialogue systems. Recent statistical approaches to this problem require additional resources (e.g. gazettee...
Prosody is an important cue for identifying dialog acts. In this paper, we show that modeling the sequence of acoustic-prosodic values as n-gram features with a maximum entropy mod...
With the development of voice transformation and speech synthesis technologies, speaker identification systems are likely to face attacks from imposters who use voice transformed ...
Qin Jin, Arthur R. Toth, Alan W. Black, Tanja Schu...
This paper shows how to improve Hidden Conditional Random Fields (HCRFs) for phone classification by applying various speaker adaptation techniques. These include Maximum A Poste...
Yun-Hsuan Sung, Constantinos Boulis, Daniel Jurafs...
In this research, an iterative and unsupervised Turbo-style algorithm is presented and implemented for the task of automatic lexical acquisition. The algorithm makes use of spoken...
Ghinwa F. Choueiter, Mesrob I. Ohannessian, Stepha...