To support the retrieval and fusion of multimedia information from multiple sources and databases, a spatial/temporal query language called ΣQL is proposed. ΣQL is based upon the σ−operator sequence and in practice expressible in SQL-like syntax. ΣQL allows a user to specify powerful spatial/temporal queries for both multimedia data sources and multimedia databases, eliminating the need to write different queries for each. A ΣQL query can be processed in the most effective manner by first selecting the suitable transformations of multimedia data to derive the multimedia static schema, and then processing the query with respect to this multimedia static schema. In this paper we illustrate this approach by data fusion examples, investigate multimedia data transformations and provide query processing algorithms.