The major problem in building a good lipreading system is extracting effective visual features from an enormous quantity of video sequence data. For appearance-based feature analysi...
Yun Fu, Xi Zhou, Ming Liu, Mark Hasegawa-Johnson, ...
In this paper we study the connection between the sentiment of images expressed in metadata and their visual content in the social photo sharing environment Flickr. To this end, we co...
Stefan Siersdorfer, Enrico Minack, Fan Deng, Jonat...
In this work we propose a dynamic-texture-based approach to the recognition of facial Action Units (AUs, atomic facial gestures) and their temporal models (i.e., sequences of te...
News videos from different channels and languages are broadcast every day, providing abundant information for users. To effectively search, retrieve, browse and track news stories...
In this article we define a multimedia content analysis problem, which we call multimodal location estimation: Given a video/image/audio file, the task is to determine where it wa...