Temporal consistency is ubiquitous in video data: temporally adjacent video shots usually share similar visual and semantic content. This paper presents a thorough study of temporal consistency, defined with respect to semantic concepts and query topics and measured quantitatively, and discusses its implications for video analysis and retrieval tasks. We further show that, in interactive settings, exploiting temporal consistency considerably improves the performance of semantic concept detection and video retrieval. Specifically, we propose an active learning method with a temporal sampling strategy for building classifiers of semantic concepts, and a temporal reranking method for improving the efficiency of interactive video search. Both methods outperform existing methods by considerable margins on the TRECVID dataset.

Categories and Subject Descriptors H.3.1 [Information Storage and Retrieval]: Content Analysis and Indexing - indexing methods Gen...
Jun Yang 0003, Alexander G. Hauptmann
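
As an illustration of what a quantitative measure of temporal consistency could look like, the following minimal sketch compares the chance that a shot is relevant given that a shot a fixed distance earlier is relevant against the overall prior. This is an assumption made for illustration, not the measure defined in the paper; the function names and the toy label sequence are hypothetical.

```python
from typing import Sequence


def prior_relevance(labels: Sequence[int]) -> float:
    """Fraction of shots labeled relevant (1) in a temporally ordered sequence."""
    if not labels:
        raise ValueError("labels must be non-empty")
    return sum(labels) / len(labels)


def conditional_relevance(labels: Sequence[int], distance: int) -> float:
    """Estimate P(shot i + distance is relevant | shot i is relevant).

    Hypothetical measure for illustration only: labels are binary (0/1)
    and ordered by time within a single video.
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    # Labels of the shots that lie `distance` positions after a relevant shot.
    followers = [
        labels[i + distance]
        for i in range(len(labels) - distance)
        if labels[i] == 1
    ]
    if not followers:
        return float("nan")  # no relevant shot has a neighbor at this distance
    return sum(followers) / len(followers)


# Toy sequence of per-shot relevance labels (invented, not TRECVID data).
toy_labels = [0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 0]
print(prior_relevance(toy_labels))
print(conditional_relevance(toy_labels, 1))
print(conditional_relevance(toy_labels, 2))
```

Under this sketch, a conditional estimate at small distances that sits well above the prior would indicate temporal consistency for that concept or topic, and the gap would be expected to shrink as the distance grows.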