Sciweavers

PODS
2012
ACM
11 years 10 months ago
Randomized algorithms for tracking distributed count, frequencies, and ranks
We show that randomization can lead to significant improvements for a few fundamental problems in distributed tracking. Our basis is the count-tracking problem, where there are k...
Zengfeng Huang, Ke Yi, Qin Zhang
PODS
2012
ACM
11 years 10 months ago
Query-based data pricing
Data is increasingly being bought and sold online, and Web-based marketplace services have emerged to facilitate these activities. However, current mechanisms for pricing data are ...
Paraschos Koutris, Prasang Upadhyaya, Magdalena Ba...
PODS
2012
ACM
11 years 10 months ago
Mergeable summaries
We study the mergeability of data summaries. Informally speaking, mergeability requires that, given two summaries on two data sets, there is a way to merge the two summaries into ...
Pankaj K. Agarwal, Graham Cormode, Zengfeng Huang,...
SIGMOD
2012
ACM
11 years 10 months ago
Shark: fast data analysis using coarse-grained distributed memory
Shark is a research data analysis system built on a novel coarse-grained distributed shared-memory abstraction. Shark marries query processing with deep data analysis, providing a unifie...
Cliff Engle, Antonio Lupher, Reynold Xin, Matei Za...
SIGMOD
2012
ACM
11 years 10 months ago
A model-based approach to attributed graph clustering
Zhiqiang Xu, Yiping Ke, Yi Wang, Hong Cheng, James...
SIGMOD
2012
ACM
11 years 10 months ago
Locality-sensitive hashing scheme based on dynamic collision counting
Locality-Sensitive Hashing (LSH) and its variants are well-known methods for solving the c-approximate NN Search problem in high-dimensional space. Traditionally, several LSH funct...
Junhao Gan, Jianlin Feng, Qiong Fang, Wilfred Ng
SIGMOD
2012
ACM
11 years 10 months ago
Oracle in-database Hadoop: when MapReduce meets RDBMS
Big data is the tar sands of the data world: vast reserves of raw gritty data whose valuable information content can only be extracted at great cost. MapReduce is a popular parall...
Xueyuan Su, Garret Swart
SIGMOD
2012
ACM
11 years 10 months ago
BloomUnit: declarative testing for distributed programs
We present BloomUnit, a testing framework for distributed programs written in the Bloom language. BloomUnit allows developers to write declarative test specifications that descri...
Peter Alvaro, Andrew Hutchinson, Neil Conway, Will...
SIGMOD
2012
ACM
11 years 10 months ago
Tiresias: a demonstration of how-to queries
In this demo, we will present Tiresias, the first how-to query engine. How-to queries represent fundamental data analysis questions of the form: “How should the input change in...
Alexandra Meliou, Yisong Song, Dan Suciu
SIGMOD
2012
ACM
11 years 10 months ago
Skew-aware automatic database partitioning in shared-nothing, parallel OLTP systems
The advent of affordable, shared-nothing computing systems portends a new class of parallel database management systems (DBMS) for on-line transaction processing (OLTP) applicatio...
Andrew Pavlo, Carlo Curino, Stanley B. Zdonik