This work proposes a model for video retrieval based upon the inference network model. The document network is constructed from video metadata encoded in MPEG-7 and captures information pertaining to the structural aspects (the breakdown of a video into scenes and shots), the conceptual aspects (video, scene and shot content) and the contextual aspects (information about the position of conceptual content within the document). The retrieval process a) exploits the distribution of evidence among the shots to rank results at different levels of granularity, b) addresses the idea that evidence may be inherited during evaluation, and c) exploits the contextual information to perform constrained queries.

Keywords
Structured Video Retrieval, MPEG-7, Inference network, Combination of evidence

Categories and Subject Descriptors
H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval--Retrieval models; H.3.7 [Information Storage and Retrieval]: Digital Libraries--Standards

Ge...