The presence of duplicate documents in the World Wide Web adversely affects crawling, indexing and relevance, which are the core building blocks of web search. In this paper, we pres...
Hema Swetha Koppula, Krishna P. Leela, Amit Agarwa...
In this paper, we propose a novel framework called SmartMiner for the web usage mining problem, which uses link information for producing accurate user sessions and frequent navigation...
Murat Ali Bayir, Ismail Hakki Toroslu, Ahmet Cosar...
We present an automatic skew mitigation approach for user-defined MapReduce programs and present SkewTune, a system that implements this approach as a drop-in replacement for an e...
YongChul Kwon, Magdalena Balazinska, Bill Howe, Je...
As Peer Data Management Systems (PDMS) are a focus of current research, there are lots of approaches like query processing or routing issues that have to be evaluated. Since ther...
Katja Hose, Christian Lemke, Jana Quasebarth, Kai-...
Using SQL has not been considered an efficient and feasible way to implement data mining algorithms. Although this is true for many data mining, machine learning and statistical a...