In this demo we present the cgmOLAP server, the first fully functional parallel OLAP system able to build data cubes at a rate of more than 1 Terabyte per hour. cgmOLAP incorporat...
Ying Chen, Andrew Rau-Chaplin, Frank K. H. A. Dehn...
We consider the problem of speeding up Entity Recognition systems that exploit existing large databases of structured entities to improve extraction accuracy. These systems requir...
Data cleaning based on similarities involves identification of "close" tuples, where closeness is evaluated using a variety of similarity functions chosen to suit the do...
Graphs have become popular for modeling structured data. As a result, graph queries are becoming common and graph indexing has come to play an essential role in query processing. ...
Incorporating the skyline operator inside the relational engine requires solving the cardinality estimation and cost estimation problems, both hitherto unaddressed. We propose robus...
Surajit Chaudhuri, Nilesh N. Dalvi, Raghav Kaushik
Streaming XPath evaluation algorithms must record a potentially exponential number of pattern matches when both predicates and descendant axes are present in queries, and the XML ...
In this paper, we introduce a new class of data mining problems called learning from aggregate views. In contrast to the traditional problem of learning from a single table of tra...
Bee-Chung Chen, Lei Chen, Raghu Ramakrishnan,...
Adaptivity is a challenging open issue in data stream management. In this paper, we tackle the problem of memory adaptivity inside a system executing temporal sliding window queri...
In order to support continuous queries over data streams, a plethora of suitable techniques as well as prototypes have been developed and evaluated in recent years. In particular,...
DaWaII (Data Warehouse IntegratIon) is a tool for supporting the various activities related to the integration of multidimensional databases. This problem arises in common scenari...