In this paper, we propose a novel scene categorization method based on contextual visual words. In this method, we extend the traditional ‘bags of visual words’ model by introd...
Spoken dialogue is notoriously hard to process with standard language processing technologies. Dialogue systems must indeed meet two major challenges. First, natural spoken dialogu...
Speaker clustering is the task of grouping a set of speech utterances into speaker-specific classes. The basic techniques for solving this task are similar to those used for spea...
In this paper two aspects of generating and using phonetic Arabic dictionaries are described. First, the use of single pronunciation acoustic models in the context of Arabic large...
Frank Diehl, Mark J. F. Gales, Marcus Tomalin, Phi...
Phoneme segmentation is a fundamental problem in many speech recognition and synthesis studies. Unsupervised phoneme segmentation assumes no knowledge on linguistic contents and a...