This paper explores the use of fast, simple computer vision techniques to add compelling visual capabilities to social user interfaces. Social interfaces involve the user in natur...
We present an approach for tracking a lecturer during the course of his speech. We use features from multiple cameras and microphones, and process them in a joint particle filter f...
Kai Nickel, Tobias Gehrig, Hazim Kemal Ekenel, Joh...
We present FastSum, a fast query-based multi-document summarizer based solely on word-frequency features of clusters, documents, and topics. Summary sentences are ranked by a...
We present a maximally streamlined approach to learning HMM-based acoustic models for automatic speech recognition. In our approach, an initial monophone HMM is iteratively refin...
Predicting possible code-switching points can help develop more accurate methods for automatically processing mixed-language text, such as multilingual language models for speech ...