Sciweavers

PVLDB
2008
SCOPE: easy and efficient parallel processing of massive data sets
Companies providing cloud-scale services have an increasing need to store and analyze massive data sets such as search logs and click streams. For cost and performance reasons, pr...
Ronnie Chaiken, Bob Jenkins, Per-Åke Larson,...
PVLDB
2008
Tighter estimation using bottom k sketches
Summaries of massive data sets support approximate query processing over the original data. A basic aggregate over a set of records is the weight of subpopulations specified as a ...
Edith Cohen, Haim Kaplan
PVLDB
2008
Flashing up the storage layer
In the near future, commodity hardware is expected to incorporate both flash and magnetic disks. In this paper we study how the storage layer of a database system can benefit from...
Ioannis Koltsidas, Stratis Viglas
PVLDB
2008
A pay-as-you-go framework for query execution feedback
Past work has suggested that query execution feedback can be useful in improving the quality of plans by correcting cardinality estimation errors in the query optimizer. The state...
Surajit Chaudhuri, Vivek R. Narasayya, Ravishankar...
PVLDB
2008
Accuracy estimate and optimization techniques for SimRank computation
The measure of similarity between objects is a very useful tool in many areas of computer science, including information retrieval. SimRank is a simple and intuitive measure of th...
Dmitry Lizorkin, Pavel Velikhov, Maxim N. Grinev, ...
PVLDB
2008
Scalable ranked publish/subscribe
Ashwin Machanavajjhala, Erik Vee, Minos N. Garofal...
PVLDB
2008
Efficient search for the top-k probable nearest neighbors in uncertain databases
Uncertainty pervades many domains in our lives. Current real-life applications, e.g., location tracking using GPS devices or cell phones, multimedia feature extraction, and sensor...
George Beskales, Mohamed A. Soliman, Ihab F. Ilyas
PVLDB
2008
Community-driven data grids
Beyond already existing huge data volumes, e-science communities face major challenges in managing the anticipated data deluge of forthcoming projects. Community-driven data grids...
Tobias Scholl, Alfons Kemper
PVLDB
2008
WebTables: exploring the power of tables on the web
The World-Wide Web consists of a huge number of unstructured documents, but it also contains structured data in the form of HTML tables. We extracted 14.1 billion HTML tables from...
Michael J. Cafarella, Alon Y. Halevy, Daisy Zhe Wa...