Relational database systems often support activities such as data warehousing, data cleaning, and data integration. All of these activities require some form of data transformation. Since the data usually resides in relational databases, data transformations are often specified in SQL, which is based on relational algebra. However, many useful data transformations cannot be expressed as SQL queries because of the limited expressive power of relational algebra. In particular, an important class of data transformations, those that produce several output tuples for a single input tuple, cannot be expressed in that way. In this paper, we analyze alternatives for processing one-to-many data transformations using relational database management systems (RDBMSs), and compare them in terms of expressiveness, optimizability, and performance.
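To make the notion of a one-to-many transformation concrete, the following minimal Python sketch splits each loan tuple into a sequence of payment tuples. The relation names (`Loan`, `Payment`), their fields, and the 100-unit installment cap are hypothetical and chosen only for illustration; they are not taken from the paper. The point of the sketch is that the number of output tuples depends on a value stored in the input tuple rather than being fixed in advance.

```python
from typing import Iterator, List, NamedTuple


class Loan(NamedTuple):
    # Hypothetical input relation: one tuple per loan.
    account: str
    amount: int


class Payment(NamedTuple):
    # Hypothetical output relation: several tuples per loan.
    account: str
    seq_no: int
    amount: int


def split_into_payments(loan: Loan, max_payment: int = 100) -> Iterator[Payment]:
    """Yield one Payment per installment of at most max_payment units."""
    if max_payment <= 0:
        raise ValueError("max_payment must be positive")
    remaining = loan.amount
    seq_no = 1
    while remaining > 0:
        installment = min(remaining, max_payment)
        yield Payment(loan.account, seq_no, installment)
        remaining -= installment
        seq_no += 1


def transform(loans: List[Loan], max_payment: int = 100) -> List[Payment]:
    """Apply the one-to-many transformation to every input tuple."""
    return [p for loan in loans for p in split_into_payments(loan, max_payment)]


# Illustrative data only.
loans = [Loan("A1", 250), Loan("B2", 100)]
payments = transform(loans)
for payment in payments:
    print(payment)
```

In this sketch the loan of 250 yields three payment tuples, while the loan of 100 yields one, so the fan-out is determined by the data itself.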
Paulo J. F. Carreira, Helena Galhardas, Joã