The paper proposes a diphone/sub-syllable method for Arabic text-to-speech systems. The proposed approach exploits the particular syllabic structure of Arabic words. For good ...
This paper introduces a new optimization technique for moving segment labels (phone and subphonetic) to optimize statistical parametric speech synthesis models. The choice of obje...
Model-based methods for sequential organization in cochannel speech require pretrained speaker models and often prior knowledge of participating speakers. We propose an unsupervis...
This paper addresses the problem of automatic detection of salient video segments for real-world applications such as corporate training based on associated speech transcriptions....
This paper examines tagging models for spontaneous English speech transcripts. We analyze the performance of state-of-the-art tagging models, either generative or discriminative, ...
Vladimir Eidelman, Zhongqiang Huang, Mary P. Harpe...