This paper surveys previous work on combining planning techniques with expressive representations of knowledge in description logics to reason about tasks, plans, and goals. Descr...
In this paper we propose a system that annotates a user-generated video based on the associated location metadata, by exploiting user-tagged image databases. An example of such a ...
Large vocabulary automatic speech recognition (ASR) technologies perform well in known and controlled contexts. In less controlled conditions, however, human review is often neces...
We are developing a testbed for learning by demonstration that combines spoken language and sensor data in a natural real-world environment. Microsoft Kinect RGB-Depth cameras allow us...
We describe a method to align ASL video subtitles with a closed-caption transcript. Our alignments are partial, based on spotting words within the video sequence, which consists o...