Sciweavers

32 search results - page 4 / 7
» Reservoir-Based Random Sampling with Replacement from Data S...
SIGMOD
2010
ACM
14 years 15 days ago
Continuous sampling for online aggregation over multiple queries
In this paper, we propose an online aggregation system called COSMOS (Continuous Sampling for Multiple queries in an Online aggregation System), to process multiple aggregate quer...
Sai Wu, Beng Chin Ooi, Kian-Lee Tan
ICDM
2007
IEEE
14 years 2 months ago
On Appropriate Assumptions to Mine Data Streams: Analysis and Practice
Recent years have witnessed an increasing number of studies in stream mining, which aim at building an accurate model for continuously arriving data. Somehow most existing work ma...
Jing Gao, Wei Fan, Jiawei Han
SDM
2010
SIAM
13 years 9 months ago
MACH: Fast Randomized Tensor Decompositions
Tensors naturally model many real-world processes which generate multi-aspect data. Such processes appear in many different research disciplines, e.g., chemometrics, computer visio...
Charalampos E. Tsourakakis
WWW
2005
ACM
14 years 8 months ago
Sampling search-engine results
We consider the problem of efficiently sampling Web search engine query results. In turn, using a small random sample instead of the full set of results leads to efficient approxi...
Aris Anagnostopoulos, Andrei Z. Broder, David Carm...
BMCBI
2007
13 years 7 months ago
Bias in random forest variable importance measures: Illustrations, sources and a solution
Variable importance measures for random forests have been receiving increased attention as a means of variable selection in many classification tasks in bioinformatics and relate...
Carolin Strobl, Anne-Laure Boulesteix, Achim Zeile...