Abstract. In this study, gesture duration and articulator velocity in consonant-vowel transitions have been analysed using electromagnetic articulography (EMA). The receiver coil...
Dominik Bauer, Jim Kannampuzha, Phil Hoole, Bernd ...
We start from the state-of-the-art Bag of Words pipeline that in the 2008 benchmarks of TRECvid and PASCAL yielded the best performance scores. We have contributed to that pipelin...
Jasper R. R. Uijlings, Arnold W. M. Smeulders, Rem...
Describing shots through the occurrence of semantic concepts is the first step towards modeling the content of a video semantically. An important challenge is to automatically se...
Spatial language video retrieval is an important real-world problem that is also a natural test bed for evaluating semantic structures for natural language descriptions of motion ...
Sequence matching techniques are effective for comparing two videos. However, existing approaches suffer from demanding computational costs and thus are not scalable for large-sca...
Contextual information is vital for the robust extraction of semantic information in automated surveillance systems. We have developed a scene-independent framework for the detect...
We present an efficient world-scale system for automatically annotating collections of geo-referenced photos. As a user uploads a photograph, a place of origin is estimate...
Jim Kleban, Emily Moxley, Jiejun Xu, B. S. Manjuna...
Uploading tourist photographs is a popular activity on photo sharing platforms. The manual annotation of these images is a tedious process and the users often upload their images ...