Sciweavers

4526 search results - page 791 / 906
» An overview of clustering methods
WWW
2008
ACM
14 years 11 months ago
Efficient similarity joins for near duplicate detection
With the increasing amount of data and the need to integrate data from multiple data sources, a challenging issue is to find near duplicate records efficiently. In this paper, we ...
Chuan Xiao, Wei Wang 0011, Xuemin Lin, Jeffrey Xu ...
WWW
2007
ACM
14 years 11 months ago
Generative models for name disambiguation
Name ambiguity is a special case of identity uncertainty where one person can be referenced by multiple name variations in different situations or even share the same name with ot...
Yang Song, Jian Huang 0002, Isaac G. Councill, Jia...
WWW
2007
ACM
14 years 11 months ago
Using d-gap patterns for index compression
Sequential patterns of d-gaps exist pervasively in inverted lists of Web document collection indices due to the cluster property. In this paper the information of d-gap sequential...
Jinlin Chen, Terry Cook
WWW
2006
ACM
14 years 11 months ago
Probabilistic models for discovering e-communities
The increasing amount of communication between individuals in e-formats (e.g. email, instant messaging and the Web) has motivated computational research in social network analysis...
Ding Zhou, Eren Manavoglu, Jia Li, C. Lee Giles, H...
WWW
2004
ACM
14 years 11 months ago
FADA: find all distinct answers
The wealth of information available on the web makes it an attractive resource for seeking quick answers. While web-based question answering becomes an emerging topic in recent ye...
Hui Yang, Tat-Seng Chua