The automatic transcription of broadcast news and meetings involves the segmentation, identification and tracking of speaker turns during each session, which is known as speaker di...
This paper shows (i) improvements over state-of-the-art local feature recognition systems, (ii) how to formulate principled models for automatic local feature selection in object c...
The human voice is primarily a carrier of speech, but it also contains non-linguistic features unique to a speaker and indicative of various speaker demographics, e.g. gender, nat...
Many current state-of-the-art speaker diarization systems exploit agglomerative hierarchical clustering (AHC) as their speaker clustering strategy, due to its simple processing str...
An important task for multiparty meeting understanding is extracting action items. Action items are a set of tasks that are agreed on by the participants for execution after the m...