Spatial Database Management Systems (SDBMS), e.g., Geographical Information Systems, that manage spatial objects such as points, lines, and hyper-rectangles, often have very high ...
Data items archived in data warehouses or those that arrive online as streams typically have attributes which take values from multiple hierarchies (e.g., time and geographic loca...
Graham Cormode, Flip Korn, S. Muthukrishnan, Dives...
Microarray datasets typically contain a large number of columns but a small number of rows. Association rules have proved useful in analyzing such datasets. However, most e...
Gao Cong, Anthony K. H. Tung, Xin Xu, Feng Pan, Ji...
We are developing a distributed query processor called PIER, which is designed to run on the scale of the entire Internet. PIER utilizes a Distributed Hash Table (DHT) as its comm...
Brent N. Chun, Joseph M. Hellerstein, Ryan Huebsch...
We make two main contributions in this paper. First, we motivate and introduce a novel class of data mining problems that arise in labeling a group of mass spectra, specifically f...
We present BLAS, a Bi-LAbeling based System, for efficiently processing complex XPath queries over XML data. BLAS uses P-labeling to process queries involving consecutive child ax...
Block-level sampling is far more efficient than true uniform-random sampling over a large database, but prone to significant errors if used to create database statistics. In this ...
Exploratory ad-hoc queries could return too many answers, a phenomenon commonly referred to as "information overload". In this paper, we propose to automatically catego...