High-quality structured data from structured Web sources is invaluable for many applications. Hidden Web databases are not directly crawlable by Web search engines and are on...
Partitioned query processing is an effective method to process continuous queries with large stateful operators in a distributed system. This method typically partitions input da...
Implementations of map-reduce are being used to perform many operations on very large data. We examine strategies for joining several relations in the map-reduce environment. Our ...
A fundamental problem in data management is to draw a sample of a large data set, for approximate query answering, selectivity estimation, and query planning. With large, streamin...
Graham Cormode, S. Muthukrishnan, Ke Yi, Qin Zhang
Large-scale data analysis and mining activities, such as identifying interesting trends, making unusual patterns stand out, and verifying hypotheses, require sophisticated infor...