Training accurate acoustic models typically requires a large amount of transcribed data, which can be expensive to obtain. In this paper, we describe a novel semi-supervised learn...
Balakrishnan Varadarajan, Dong Yu, Li Deng, Alex A...
In recent years, the field of automatic speaker identification has begun to exploit high-level sources of speaker-discriminative information, in addition to traditional models o...
Defining suitable features for environmental sounds is an important problem in an automatic acoustic scene recognition system. As with most pattern recognition problems, extracti...
This paper presents a method for automatic recognition of human gestures. The method works with 3D image data from a range camera to achieve invariance to viewpoint. The recogniti...
Data-driven Spoken Language Understanding (SLU) systems need semantically annotated data, which is expensive and time-consuming to produce and prone to human error. Active learning has been su...