Sciweavers

» MapReduce: Simplified Data Processing on Large Clusters
WSDM
2010
ACM
204 views, Data Mining
14 years 2 months ago
Learning URL patterns for webpage de-duplication
Presence of duplicate documents in the World Wide Web adversely affects crawling, indexing and relevance, which are the core building blocks of web search. In this paper, we pres...
Hema Swetha Koppula, Krishna P. Leela, Amit Agarwa...
WWW
2009
ACM
14 years 8 months ago
Smart Miner: a new framework for mining large scale web usage data
In this paper, we propose a novel framework called SmartMiner for the web usage mining problem, which uses link information for producing accurate user sessions and frequent navigation...
Murat Ali Bayir, Ismail Hakki Toroslu, Ahmet Cosar...
SIGMOD
2012
ACM
226 views, Database
11 years 10 months ago
SkewTune: mitigating skew in mapreduce applications
We present an automatic skew mitigation approach for user-defined MapReduce programs and present SkewTune, a system that implements this approach as a drop-in replacement for an e...
YongChul Kwon, Magdalena Balazinska, Bill Howe, Je...
BTW
2007
Springer
140 views, Database
14 years 1 months ago
SmurfPDMS: A Platform for Query Processing in Large-Scale PDMS
As Peer Data Management Systems (PDMS) are a focus of current research, there are lots of approaches like query processing or routing issues that have to be evaluated. Since ther...
Katja Hose, Christian Lemke, Jana Quasebarth, Kai-...
KDD
2004
ACM
624 views, Data Mining
14 years 1 months ago
Programming the K-means clustering algorithm in SQL
Using SQL has not been considered an efficient and feasible way to implement data mining algorithms. Although this is true for many data mining, machine learning and statistical a...
Carlos Ordonez