Semantic integration in the hidden Web is an emerging area of research where traditional assumptions do not always hold. Frequent changes, conflicts, and the sheer size of the hidden Web demand vastly different integration techniques that rely on autonomous heterogeneity detection and resolution, correspondence establishment, and information extraction strategies. In this paper, we present an algebraic language called Integra as a foundation for BioFlow, an SQL-like query language for the integration of Life Sciences data on the hidden Web. The algebra adopts the view that web forms can be treated as user-defined functions and that the responses they generate from back-end databases can be treated as traditional relations or tables. These assumptions allow us to extend traditional relational algebra with integration primitives such as schema matching, wrappers, form submission, and object identification as a family of database functions. T...
Shazzad Hosain, Hasan M. Jamil