A wealth of information is available on the Web. Often, however, such data are hidden behind form interfaces that allow only a restricted set of queries over the underlying databases, greatly hindering data exploration. The ability to materialize these databases has many applications, from allowing the data to be mined effectively to providing better response times in Web information integration systems. However, reconstructing database images through restricted interfaces can be a daunting task, and it is sometimes infeasible because of network traffic and high latencies from Web servers. In this paper we introduce the problem of generating efficient query covers: given a restricted query interface, how can a complete image of the underlying database be reconstructed efficiently? We propose a solution to the problem of finding covers for spatial queries over databases accessible through nearest-neighbor interfaces. Our algorithm guarantees complete coverage and leads to speedups of over 50 when...
Cláudio T. Silva, Juliana Freire, Simon Bye