Sciweavers

» On the Performance of Ant-based Clustering
WWW
2008
ACM
14 years 10 months ago
Efficient similarity joins for near duplicate detection
With the increasing amount of data and the need to integrate data from multiple data sources, a challenging issue is to find near duplicate records efficiently. In this paper, we ...
Chuan Xiao, Wei Wang 0011, Xuemin Lin, Jeffrey Xu ...
WWW
2007
ACM
14 years 10 months ago
Using d-gap patterns for index compression
Sequential patterns of d-gaps exist pervasively in inverted lists of Web document collection indices due to the cluster property. In this paper the information of d-gap sequential...
Jinlin Chen, Terry Cook
WWW
2006
ACM
14 years 10 months ago
GoGetIt!: a tool for generating structure-driven web crawlers
We present GoGetIt!, a tool for generating structure-driven crawlers that requires a minimum effort from the users. The tool takes as input a sample page and an entry point to a W...
Altigran Soares da Silva, Edleno Silva de Moura, J...
WWW
2006
ACM
14 years 10 months ago
The impact of online music services on the demand for stars in the music industry
The music industry's business model is to produce stars. In order to do so, musicians producing music that fits into well-defined clusters of factors explaining the demand of...
Ian Pascal Volz
WWW
2006
ACM
14 years 10 months ago
Probabilistic models for discovering e-communities
The increasing amount of communication between individuals in e-formats (e.g., email, instant messaging, and the Web) has motivated computational research in social network analysis...
Ding Zhou, Eren Manavoglu, Jia Li, C. Lee Giles, H...