Inferences from time-series data can be greatly enhanced by taking into account multiple modalities. In some cases, such as audio of speech and the corresponding video of lip gest...
Trausti T. Kristjansson, Brendan J. Frey, Thomas S...
The fusion of information from heterogeneous sensors is crucial to the effectiveness of a multimodal system. Noise affects the sensors of different modalities independently. A good ...
Shankar T. Shivappa, Bhaskar D. Rao, Mohan M. Triv...
This article introduces automatic speech recognition based on Electro-Magnetic Articulography (EMA). Movements of the tongue, lips, and jaw are tracked by an EMA device, which are...
The “tandem approach” in speech recognition has been shown to increase performance by using classifier posterior probabilities as observations in a...
In the present work we study the appropriateness of a number of linear and non-linear regression methods, employed on the task of speech segmentation, for combining multiple phone...