Conventional data retrieval operators are based either on exact matching or on a total order relationship among elements. Neither is appropriate for managing complex data, such as multimedia data, time series and genetic sequences. In fact, the most meaningful way to compare complex data is by similarity. However, the Relational Algebra employed in Relational Database Management Systems (RDBMS) cannot express similarity criteria. To address this issue, we provide here an extension of the Relational Algebra aimed at representing similarity queries in algebraic expressions. This paper identifies fundamental properties that allow unary similarity operators to be integrated into the Relational Algebra, so that similarity-based operators can be handled either alone or combined with the existing (exact matching and/or relational) operators. We also show how to take advantage of such properties to optimize similarity queries, incorporating these properties into a similarity query op...
Mônica Ribeiro Porto Ferreira, Agma J. M. Tr