Statistical machine translation (SMT) systems for spoken languages suffer from conversational speech phenomena, in particular the presence of speech disfluencies. We examine the i...
Gaussian mixture models (GMMs) are a convenient and essential tool for the estimation of probability density functions. Although GMMs are used in many research domains from image ...
In this paper we present a novel scheme for unstructured audio scene classification that possesses three highly desirable and powerful features: autonomy, scalability, and robust...
Julian Ramos, Sajid M. Siddiqi, Artur Dubrawski, G...
Efficient design of wireless networks requires implementation of cross-layer algorithms that exploit channel state information. Capitalizing on convex optimization and stochastic...
Distance-Based Amplitude Panning (DBAP) has recently been proposed as a new technique for panning sound sources in two- and three-dimensional spaces. In this paper, DBAP is ...
Dimitar Kostadinov, Joshua D. Reiss, Valeri Mladen...
We propose a new method to characterize a speaker within the Joint Factor Analysis (JFA) framework. Scoring within the JFA framework can be costly and a new method was proposed to...
The following article presents a novel, adaptive initialization scheme that can be applied to most state-of-the-art Speaker Diarization algorithms, i.e. algorithms that use agglom...
This paper proposes a novel algorithm for minimizing the perceptual distortion in non-negative matrix factorization (NMF) based audio representation. We formulate the noise-to-mas...
In many half-duplex cooperative systems, the direct formulation of the problem of finding the jointly optimal power and channel resource allocation that maximizes a weighted sum ...
In this paper we present a new approach to the synthesis of filled pauses. The problem is tackled from the point of view of disfluent speech synthesis. Based on the synth...
Jordi Adell, Antonio Bonafonte, David Escudero Man...