Annotation of large multilingual corpora remains a challenge to the data-driven approach to speech research, especially for under-resourced languages. This paper presents crosslan...
In this paper, we use information retrieval (IR) techniques to improve an automatic speech recognition (ASR) system. The potential benefits include improved speed, accuracy, and scalability...
The major contribution of this paper is a general unifying description of distributed algorithms that allows local, node-based algorithms to be mapped onto a single gl...
We propose an ℓ1 criterion for dictionary learning for sparse signal representation. Instead of directly searching for the dictionary vectors, our dictionary learning approach i...
Detecting the time of occurrence of an acoustic event (for instance, a cheer) embedded in a longer soundtrack is useful and important for applications such as search and retrieval...
Keansub Lee, Daniel P. W. Ellis, Alexander C. Loui
We propose a novel algorithm for sparse system identification in the frequency domain. Key to our result is the observation that the Fourier transform of the sparse impulse respo...
We propose a new framework for speaker recognition, referred to as Fishervoice. It includes the design of a feature representation known as the structured score vector (SSV), which r...
This paper proposes a method to estimate the parameters of the relative phase probability density function (RP pdf) of the complex coefficients when the image is corrupted by add...
We present a novel framework based on hidden Markov models (HMMs) for matching feature point sets, which capture the shapes of object contours of interest. Point matching algorith...
We extensively evaluated a data hiding algorithm for stereo audio signals that embeds data using the polarity of the echoes added to the high-frequency channels, which we have pr...