We propose an algorithm that enables joint Viterbi decoding of multiple independent audio recordings of a word to derive its pronunciation. Experiments show that this method resul...
There has been much recent progress in the technical infrastructure necessary to continuously characterize and archive all sounds, or more precisely auditory streams, that occur w...
Jiachen Xue, Gordon Wichern, Harvey D. Thornburg, ...
In this paper we describe a technique that allows the extraction of multiple local shift-invariant features from analysis of non-negative data of arbitrary dimensionality. Our app...
Paris Smaragdis, Bhiksha Raj, Madhusudana V. S. Sh...
We present a distance measure between audio files designed to identify cover songs, which are new renditions of previously recorded songs. For each song we compute the chromagram...
This paper describes experiments in automatic recognition of context-independent phoneme strings from meeting data using audiovisual features. Visual features are known to improve ...