Many documentary videos use background music to help structure the content and communicate the semantic. In this paper, we investigate semantic segmentation of documentary video u...
Abstract. In the last decade speech processing has been applied in commercially available products. One of the key reasons for its success is the identification and use of an unde...
In the paper we present two techniques improving the recognition accuracy of multilayer perceptron neural networks (MLP ANN) by means of adopting Speaker Adaptive Training. The us...
Spoken utterance retrieval was largely studied in the last decades, with the purpose of indexing large audio databases or of detecting keywords in continuous speech streams. While...
Mickael Rouvier, Georges Linares, Benjamin Lecoute...
Due to upcoming mobile telephony services with higher speech quality, a wideband (50 Hz to 7 kHz) mobile telephony derivative of TIMIT has been recorded called WTIMIT. It allows a...