The convolution tree kernel has shown promising results in semantic role labeling (SRL). However, this kernel does not consider much linguistic knowledge in kernel design and only perf...
Min Zhang, Wanxiang Che, Guodong Zhou, AiTi Aw, Ch...
Traditional methods of spoken utterance classification (SUC) adopt two independently trained phases. In the first phase, an automatic speech recognition (ASR) module ret...
Sibel Yaman, Li Deng, Dong Yu, Ye-Yi Wang, Alex Ac...
The parallel phone recognition followed by language model (PPRLM) architecture represents one of the state-of-the-art spoken language identification systems. A PPRLM system compris...
In this paper, we optimize and evaluate computational models of similarity for sounds from the same instrument class. We investigate four instrument classes: bass drums, snare drum...
Epoch is the instant of significant excitation of the vocal-tract system during production of speech. For most voiced speech, the most significant excitation takes place around the...
In this paper, we introduce cross-multiplicative transfer function (CMTF) approximation for modeling linear systems in the short-time Fourier transform (STFT) domain. We assume tha...
The new model reduces the impact of local spectral and temporal variability by estimating a finite set of spectral and temporal warping factors which are applied to speech at the f...
Antonio Miguel, Eduardo Lleida, Richard Rose, Luis...
Source separation of musical signals is an appealing but difficult problem, especially in the single-channel case. In this paper, an unsupervised single-channel music source separa...
In this paper, we investigate the noise robustness of Wang and Shamma's early auditory (EA) model for the calculation of an auditory spectrum in audio classification applicati...