This work introduces a robot driven camera controlled by speech. The SIMIS database of 20 recordings of real life surgical operations serves as basis for analyses and noise modell...
The method which is called the “tandem approach” in speech recognition has been shown to increase performance by using classifier posterior probabilities as observations in a...
Emotion recognition grows to an important factor in future media retrieval and man machine interfaces. However, even human deciders often experience problems realizing one’s emo...
This paper describes a new approach to modeling duration for LVCSR using SCARF, a toolkit for speech recognition with segmental conditional random fields. We utilize SCARF’s abi...
This paper describes a simple method for significantly improving Tandem features used to train acoustic models for large-vocabulary speech recognition. The linear activations at ...