This paper presents a framework for data modeling and semantic abstraction of image/video data. The framework is based on spatio-temporal information associated with salient objects in an image or in a sequence of video frames, and on a set of generalized n-ary operators defined to specify spatial and temporal relationships among the objects present in the data. The methodology can be used effectively to conceptualize events and heterogeneous views of multimedia data as perceived by individual users. The proposed paradigm induces a multilevel indexing and searching mechanism that models information at various levels of granularity and hence allows content-based queries to be processed in real time. We also devise a unified object-oriented interface through which users with heterogeneous views can specify queries on the unbiased encoded data. The framework is currently being developed to realize a highly integrated multimedia database architecture.