Sciweavers

4110 search results - page 766 / 822
» Efficient algorithms for the 2-gathering problem
WWW
2003
ACM
16 years 4 months ago
Text joins in an RDBMS for web data integration
The integration of data produced and collected across autonomous, heterogeneous web services is an increasingly important and challenging problem. Due to the lack of global identi...
Luis Gravano, Panagiotis G. Ipeirotis, Nick Koudas...
KDD
2009
ACM
16 years 4 months ago
Pervasive parallelism in data mining: dataflow solution to co-clustering large and sparse Netflix data
All Netflix Prize algorithms proposed so far are prohibitively costly for large-scale production systems. In this paper, we describe an efficient dataflow implementation of a coll...
Srivatsava Daruru, Nena M. Marin, Matt Walker, Joy...
SIGMOD
2005
ACM
16 years 4 months ago
Holistic Aggregates in a Networked World: Distributed Tracking of Approximate Quantiles
While traditional database systems optimize for performance on one-shot queries, emerging large-scale monitoring applications require continuous tracking of complex aggregates and...
Graham Cormode, Minos N. Garofalakis, S. Muthukris...
SIGMOD
2002
ACM
16 years 4 months ago
Coordinating backup/recovery and data consistency between database and file systems
Managing a combined store consisting of database data and file data in a robust and consistent manner is a challenge for database systems and content management systems. In such a...
Suparna Bhattacharya, C. Mohan, Karen Brannon, Ind...
ICDE
2010
IEEE
16 years 3 months ago
PIP: A Database System for Great and Small Expectations
Estimation via sampling out of highly selective join queries is well known to be problematic, most notably in online aggregation. Without goal-directed sampling strategies, samples...
Oliver Kennedy, Christoph Koch