We present a universal Part-of-Speech (POS) tagset framework covering most Indian languages (ILs), following a hierarchical and decomposable tagset schema. In spite of si...
A method that exploits an information-theoretic framework to extract optimized audio features using video information is presented. A simple measure of mutual information (MI) betw...
We describe some high-level approaches to estimating confidence scores for the words output by a speech recognizer. By "high-level" we mean that the proposed me...
When automatic speech recognition (ASR) and speaker verification (SV) are applied in adverse acoustic environments, endpoint detection and energy normalization can be crucial to th...
A classification and regression tree approach was used in this research to model phone durations in Lithuanian. 300 thousand samples of vowels and 400 thousand samples of consonants e...