Sciweavers

» Link Based Clustering of Web Search Results
AIRWEB
2007
Springer
14 years 2 months ago
Splog Detection Using Self-similarity Analysis on Blog Temporal Dynamics
This paper focuses on spam blog (splog) detection. Blogs are highly popular, new media social communication mechanisms. The presence of splogs degrades blog search results as well...
Yu-Ru Lin, Hari Sundaram, Yun Chi, Jun'ichi Tatemu...
WWW
2007
ACM
14 years 9 months ago
GigaHash: scalable minimal perfect hashing for billions of urls
A minimal perfect hash function maps a static set of n keys onto the range of integers {0, 1, 2, ..., n - 1}. We present a scalable, high-performance algorithm based on random graphs for ...
Kumar Chellapilla, Anton Mityagin, Denis Xavier Ch...
CIKM
2010
Springer
13 years 7 months ago
Using Wikipedia categories for compact representations of chemical documents
Today, Web pages are usually accessed using text search engines, whereas documents stored in the deep Web are accessed through domain-specific Web portals. These portals rely on e...
Benjamin Köhncke, Wolf-Tilo Balke
KDD
2004
ACM
14 years 9 months ago
A cross-collection mixture model for comparative text mining
In this paper, we define and study a novel text mining problem, which we refer to as Comparative Text Mining (CTM). Given a set of comparable text collections, the task of compara...
ChengXiang Zhai, Atulya Velivelli, Bei Yu
SIGSOFT
2005
ACM
14 years 2 months ago
Detecting higher-level similarity patterns in programs
Cloning in software systems is known to create problems during software maintenance. Several techniques have been proposed to detect the same or similar code fragments in software...
Hamid Abdul Basit, Stan Jarzabek