Data partitioning is often used to scale up a database system. In a centralized database system, the serialization order of committed update transactions can be inferred from the d...
Until recently, most data integration techniques involved central components, e.g., global schemas, to enable transparent access to heterogeneous databases. Today, however, with t...
Emerging applications in sensor systems and network-wide IP traffic analysis present many technical challenges. They need distributed monitoring and continuous tracking of events....
Wireless sensor networks are proving to be useful in a variety of settings. A core challenge in these networks is to minimize energy consumption. Prior database research has propo...
David Chu, Amol Deshpande, Joseph M. Hellerstein, ...
In this demo we present the cgmOLAP server, the first fully functional parallel OLAP system able to build data cubes at a rate of more than 1 Terabyte per hour. cgmOLAP incorporat...
Ying Chen, Andrew Rau-Chaplin, Frank K. H. A. Dehn...
We consider the problem of speeding up Entity Recognition systems that exploit existing large databases of structured entities to improve extraction accuracy. These systems requir...
Data cleaning based on similarities involves identification of "close" tuples, where closeness is evaluated using a variety of similarity functions chosen to suit the do...
Graphs have become popular for modeling structured data. As a result, graph queries are becoming common and graph indexing has come to play an essential role in query processing. ...
Incorporating the skyline operator inside the relational engine requires solving the cardinality estimation and cost estimation problems, both hitherto unaddressed. We propose robus...
Surajit Chaudhuri, Nilesh N. Dalvi, Raghav Kaushik
Streaming XPath evaluation algorithms must record a potentially exponential number of pattern matches when both predicates and descendant axes are present in queries, and the XML ...