Sciweavers

ICASSP
2011
IEEE
13 years 8 days ago
Training of error-corrective model for ASR without using audio data
This paper introduces a method to train an error-corrective model for Automatic Speech Recognition (ASR) without using audio data. In existing techniques, it is assumed that suf...
Gakuto Kurata, Nobuyasu Itoh, Masafumi Nishimura
ICASSP
2011
IEEE
13 years 8 days ago
Support vector regression fusion scheme in phone duration modeling
A fusion scheme of phone duration models (PDMs) is presented in this work. Specifically, a support vector regression (SVR)-fusion model is fed with the predictions of a group of i...
Alexandros Lazaridis, Iosif Mporas, Todor Ganchev,...
ICASSP
2011
IEEE
13 years 8 days ago
Speech synthesis using HMM based diphone inventory encoding for low-resource devices
In this paper we describe the compression of diphone inventories used by the acoustic synthesis of a concatenative synthesis system. The inventory compression is based on a codebo...
Guntram Strecha, Matthias Wolff
ICASSP
2011
IEEE
13 years 8 days ago
Beating Nyquist through correlations: A constrained random demodulator for sampling of sparse bandlimited signals
Technological constraints severely limit the rate at which analog-to-digital converters can reliably sample signals. Recently, Tropp et al. proposed an architecture, termed the ran...
Andrew Harms, Waheed U. Bajwa, A. Robert Calderban...
ICASSP
2011
IEEE
13 years 8 days ago
Estimating note intensities in music recordings
In this paper, we present automated methods for estimating note intensities in music recordings. Given a MIDI file (representing the score) and an audio recording (representing a...
Sebastian Ewert, Meinard Müller
ICASSP
2011
IEEE
13 years 8 days ago
Score informed audio source separation using a parametric model of non-negative spectrogram
In this paper we present a new technique for monaural source separation in musical mixtures, which uses the knowledge of the musical score. This information is used to initialize ...
Romain Hennequin, Bertrand David, Roland Badeau
ICASSP
2011
IEEE
13 years 8 days ago
Improving head-related impulse response measured in noisy environments with spatio-temporal frequency analysis
A new noise reduction method based on spatio-temporal frequency analysis is proposed that can be applied to head-related impulse response (HRIR), which is an impulse response betw...
Takanori Nishino, Kazuya Takeda
ICASSP
2011
IEEE
13 years 8 days ago
Source-normalised-and-weighted LDA for robust speaker recognition using i-vectors
The recently developed i-vector framework for speaker recognition has set a new performance standard in the research field. An i-vector is a compact representation of a speaker u...
Mitchell McLaren, David A. van Leeuwen
ICASSP
2011
IEEE
13 years 8 days ago
A supervised approach to movie emotion tracking
In this paper, we present experiments on continuous time, continuous scale affective movie content recognition (emotion tracking). A major obstacle for emotion research has been t...
Nikos Malandrakis, Alexandros Potamianos, Georgios...
ICASSP
2011
IEEE
13 years 8 days ago
An efficient variable step-size proportionate affine projection algorithm
Proportionate-type affine projection algorithms (PAPAs) are very attractive choices for echo cancellation. These algorithms combine the good features (convergence and tracking) o...
Constantin Paleologu, Jacob Benesty, Felix Albu, S...