There has been a significant amount of excitement and recent work on column-oriented database systems ("column-stores"). These database systems have been shown to perfor...
We present SchemaScope, a system to derive Document Type Definitions and XML Schemas from corpora of sample XML documents. Tools are provided to visualize, clean, and refine exist...
How can we efficiently find a clustering, i.e. a concise description of the cluster structure, of a given data set which contains an unknown number of clusters of different shape ...
Today's query processing engines do not take advantage of the multiple occurrences of a relation in a query to improve performance. Instead, each instance is treated as a dis...
Yu Cao, Gopal C. Das, Chee Yong Chan, Kian-Lee Tan
The storage manager of a general-purpose database system can retain consistent disk page level snapshots and run application programs "back-in-time" against long-lived p...
Uncertain data is inherent in a few important applications such as environmental surveillance and mobile object tracking. Top-k queries (also known as ranking queries) are often n...
Freebase is a practical, scalable tuple database used to structure general human knowledge. The data in Freebase is collaboratively created, structured, and maintained. Freebase c...
Kurt D. Bollacker, Colin Evans, Praveen Paritosh, ...
As RFID tags have become smaller and cheaper, RFID technology has been applied to a wide range of areas. Recently, RFID has been adopted in the b...
We present Stretch 'n' Shrink, a query design framework that explicitly takes into account user preferences about the desired answer size, and subsequently modifies the query...