In this work, we propose a new graph-based data model and indexing method to organize and manage video data. To capture both the spatial and temporal characteristics of video, we introduce a new graph-based data model called the Spatio-Temporal Region Graph (STRG). Unlike existing graph-based data structures, which provide only spatial features, the proposed STRG also provides temporal features, which represent temporal relationships among spatial objects. The STRG is decomposed into two kinds of subgraphs: object graphs (OGs) and background graphs (BGs). In addition, we introduce a new distance measure, called the Extended Graph Edit Distance (EGED), defined in a metric space for matching and indexing. Based on clustering and the EGED, we propose a new indexing method, the STRG-Index, and compare it with the M-tree, a popular tree-based indexing method for multimedia data. In this comparison, the STRG-Index is faster and more accurate, and it outperforms the M-tree in terms of cost and speed.
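
To make the data model concrete, the following is a minimal Python sketch of a graph that stores spatial and temporal edges separately. It is an illustration under stated assumptions, not the STRG definition itself: the class name `STRG`, the `(frame, region_id)` node key, the free-form region attributes, and the rule that temporal edges link regions in consecutive frames are all hypothetical choices made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List, Set, Tuple

# Hypothetical node key: (frame index, region id within that frame).
Node = Tuple[int, int]


@dataclass
class STRG:
    """Illustrative spatio-temporal region graph (assumed layout)."""

    # Region attributes (e.g. color, size) keyed by node; names are assumptions.
    nodes: Dict[Node, dict] = field(default_factory=dict)
    # Undirected edges between regions in the same frame.
    spatial_edges: Set[FrozenSet[Node]] = field(default_factory=set)
    # Directed edges from a region to a region in the next frame.
    temporal_edges: Set[Tuple[Node, Node]] = field(default_factory=set)

    def add_region(self, frame: int, region_id: int, **attrs) -> Node:
        node = (frame, region_id)
        self.nodes[node] = attrs
        return node

    def add_spatial_edge(self, a: Node, b: Node) -> None:
        # Assumption: spatial edges only connect regions within one frame.
        if a[0] != b[0]:
            raise ValueError("spatial edges connect regions in the same frame")
        self.spatial_edges.add(frozenset((a, b)))

    def add_temporal_edge(self, a: Node, b: Node) -> None:
        # Assumption: temporal edges only connect consecutive frames.
        if b[0] != a[0] + 1:
            raise ValueError("temporal edges connect consecutive frames")
        self.temporal_edges.add((a, b))

    def successors(self, node: Node) -> List[Node]:
        """Regions in the next frame that `node` is temporally linked to."""
        return sorted(dst for src, dst in self.temporal_edges if src == node)
```

Under these assumptions, following `successors` from a region across frames traces one candidate object over time, which is the kind of temporal relationship a purely spatial region graph cannot express.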