We present here a prototype video analysis and retrieval system, called NeTra-V, that is being developed to build an object-based video representation for functionalities such as search and retrieval of video objects. A region-based content description scheme using low-level visual descriptors is proposed. In order to obtain regions for local feature extraction, a new spatio-temporal segmentation and region-tracking scheme is employed. The segmentation algorithm uses all three visual features: color, texture, and motion in the video data. A group processing scheme similar to the one in the MPEG-2 standard is used to ensure the robustness of the segmentation. The proposed approach can handle complex scenes with large motion. After segmentation, regions are tracked through the video sequence using extracted local features. The results of tracking are sequences of coherent regions, called "subobjects." Subobjects are the fundamental elements in our low-level content description ...
Yining Deng, Debargha Mukherjee, B. S. Manjunath