Webcams, microphones, pressure gauges and other sensors provide exciting new opportunities for querying and monitoring the physical world. In this paper we focus on querying wide area sensor databases, containing (XML) data derived from sensors spread over tens to thousands of miles. We present the first scalable system for executing XPATH queries on such databases. The system maintains the logical view of the data as a single XML document, while physically the data is fragmented across any number of host nodes. For scalability, sensor data is stored close to the sensors, but can be cached elsewhere as dictated by the queries (auto-tuning). Our design enables self-starting distributed queries that jump directly to the lowest common ancestor of the query result, dramatically reducing query response times. We present a novel query-evaluate-gather technique (using XSLT) for detecting (1) which data in a local database fragment is part of the query result, and (2) how to gather the missing...
Amol Deshpande, Suman Kumar Nath, Phillip B. Gibbo