A method is described for estimating the fundamental frequencies of several concurrent sounds in polyphonic music and multiple-speaker speech signals. The method consists of a comp...
Untethered multimodal interfaces are more attractive than tethered ones because they are more natural and expressive for interaction. Such interfaces usually require robust vision...
Despite their effectiveness for robust speech processing, missing data techniques are vulnerable to errors in the classification of the input speech signal’s time-frequency poi...
The new model reduces the impact of local spectral and temporal variability by estimating a finite set of spectral and temporal warping factors which are applied to speech at the f...
Antonio Miguel, Eduardo Lleida, Richard Rose, Luis...
Audio segmentation is an essential preprocessing step in several audio processing applications with a significant impact e.g. on speech recognition performance. We introduce a no...