The number of video clips available online is growing at a tremendous pace. Conventionally, user-supplied metadata text, such as the title of the video and a set of keywords, has ...
Mehmet Emre Sargin, Hrishikesh Aradhye, Pedro J. M...
The goal of this research is to infer traits about groups of people from their turn-taking behavior in natural conversation. These traits are latent attributes in a social network...
The following article presents a novel, adaptive initialization scheme that can be applied to most state-of-the-art Speaker Diarization algorithms, i.e. algorithms that use agglom...
In a large-scale language detection task, performance variation found between different component systems and different target languages has an adverse effect to the pooled error ...
Raymond W. M. Ng, Cheung-Chi Leung, Tan Lee, Bin M...