Semantic video indexing is the first step towards automatic video retrieval and personalization. We propose a data-driven stochastic modeling approach to perform both video segmentation and video indexing in a single pass. Compared with the existing Hidden Markov Model (HMM)-based video segmentation and indexing techniques, the advantages of the proposed approach are as follows: (1) the probabilistic grammar defining the video program is generated entirely from the training data allowing the proposed approach to handle various kinds of videos without having to manually redefine the program model; (2) the proposed use of the Tamura features improves the accuracy of temporal segmentation and indexing; (3) the need to use an HMM to model the video edit effects is obviated thus simplifying the processing and collection of training data and ensuring that all video segments in the database are labeled with concepts that have clear semantic meanings in order to facilitate semantics-based vid...
Yong Wei, Suchendra M. Bhandarkar, Kang Li