Low-level features of multimedia content often have limited power to discriminate a document’s relevance to a query. This has motivated researchers to investigate other types of features. In this paper, we investigate four groups of features for predicting the relevance of shots in video retrieval: low-level object features, behavioural features, vocabulary features, and window-based vocabulary features. Search logs from two user studies form the basis of our evaluation. The experimental results show that the window-based vocabulary features perform best. The behavioural features also show promising results, which makes them useful when vocabulary features are not available. We also discuss the performance of the classifiers.
Pablo Bermejo, Hideo Joho, Joemon M. Jose, Robert