Inferences from time-series data can be greatly enhanced by taking into account multiple modalities. In some cases, such as audio of speech and the corresponding video of lip gest...
Trausti T. Kristjansson, Brendan J. Frey, Thomas S...
Arvand is a robot specially designed and constructed for playing soccer according to RoboCup rules and regulations for medium-size robots. This robot consists of three main par...
The ability to detect a person's unconstrained hand in a natural video sequence has applications in sign language, gesture recognition and HCI. This paper presents a novel, unsuper...
To leverage large-scale weakly-tagged images for computer vision tasks (such as object detection and scene recognition), a novel cross-modal tag cleansing and junk image filtering...
Invariant feature descriptors such as SIFT and GLOH have been demonstrated to be very robust for image matching and object recognition. However, such descriptors are typically of ...