This paper introduces a method to train an error-corrective model for Automatic Speech Recognition (ASR) without using audio data. In existing techniques, it is assumed that suf...
A fusion scheme of phone duration models (PDMs) is presented in this work. Specifically, a support vector regression (SVR)-fusion model is fed with the predictions of a group of i...
In this paper we describe the compression of diphone inventories used by the acoustic synthesis of a concatenative synthesis system. The inventory compression is based on a codebo...
Technological constraints severely limit the rate at which analog-to-digital converters can reliably sample signals. Recently, Tropp et al. proposed an architecture, termed the ran...
Andrew Harms, Waheed U. Bajwa, A. Robert Calderban...
In this paper, we present automated methods for estimating note intensities in music recordings. Given a MIDI file (representing the score) and an audio recording (representing a...
In this paper we present a new technique for monaural source separation in musical mixtures, which uses the knowledge of the musical score. This information is used to initialize ...
A new noise reduction method based on spatio-temporal frequency analysis is proposed that can be applied to head-related impulse response (HRIR), which is an impulse response betw...
The recently developed i-vector framework for speaker recognition has set a new performance standard in the research field. An i-vector is a compact representation of a speaker u...
In this paper, we present experiments on continuous-time, continuous-scale affective movie content recognition (emotion tracking). A major obstacle for emotion research has been t...
Proportionate-type affine projection algorithms (PAPAs) are very attractive choices for echo cancellation. These algorithms combine the good features (convergence and tracking) o...
Constantin Paleologu, Jacob Benesty, Felix Albu, S...