Sciweavers

ICASSP
2008
IEEE
14 years 2 months ago
Adaptive short-time analysis-synthesis for speech enhancement
In this paper we present a new adaptive short-time Fourier analysis-synthesis scheme and demonstrate its efficacy in speech enhancement. While a number of adaptive analyses have p...
Daniel Rudoy, Prabahan Basu, Thomas F. Quatieri, B...
ICASSP
2008
IEEE
14 years 2 months ago
Parsing-based objective functions for speech recognition in translation applications
This paper looks at a parsing-based alternative to word error rate (WER) for optimizing recognition, SParseval, hypothesizing that it may be a better objective for applications su...
Dustin Hillard, Mei-Yuh Hwang, Mary P. Harper, Mar...
ICASSP
2008
IEEE
14 years 2 months ago
Exploiting temporal change of pitch in formant estimation
This paper considers the problem of obtaining an accurate spectral representation of speech formant structure when the voicing source exhibits a high fundamental frequency. Our wo...
Tao T. Wang, Thomas F. Quatieri
ICASSP
2008
IEEE
14 years 2 months ago
BIC-based audio segmentation by divide-and-conquer
Audio segmentation has received increasing attention in recent years for its potential applications in automatic indexing and transcription of audio data. Among existing audio seg...
Shih-Sian Cheng, Hsin-Min Wang, Hsin-Chia Fu
ICASSP
2008
IEEE
14 years 2 months ago
Discriminative training by iterative linear programming optimization
In this paper, we cast discriminative training problems into standard linear programming (LP) optimization. Besides being convex and having globally optimal solution(s), LP progra...
Brian Mak, Benny Ng
ICASSP
2008
IEEE
14 years 2 months ago
Quality evaluation of the G.EV-VBR speech codec
ITU-T has selected the candidate submitted by Ericsson, Nokia, Motorola, VoiceAge, and Texas Instruments as the baseline for the G.EV-VBR coding standard. G.EV-VBR is an embedded ...
Anssi Rämö, Henri Toukomaa, S. Craig Gre...
ICASSP
2008
IEEE
14 years 2 months ago
Toward a detector-based universal phone recognizer
In recent research, we have proposed a high-accuracy bottom-up detection-based paradigm for continuous phone speech recognition. The key component of our system was a bank of arti...
Sabato Marco Siniscalchi, Torbjørn Svendsen...
ICASSP
2008
IEEE
14 years 2 months ago
Modified polyphone decision tree specialization for porting multilingual Grapheme based ASR systems to new languages
Automatic speech recognition (ASR) systems have been developed only for a very limited number of the estimated 7,000 languages in the world. In order to avoid the evolvement of a ...
Sebastian Stüker
ICASSP
2008
IEEE
14 years 2 months ago
Fine-grained pitch accent and boundary tone labeling with parametric F0 features
Motivated by linguistic theories of prosodic categoricity, symbolic representations of prosody have recently attracted the attention of speech technologists. Categorical represent...
Sankaranarayanan Ananthakrishnan, Shrikanth Naraya...