Sciweavers

» Clustering documents in a web directory
PDP
2008
IEEE
14 years 1 month ago
Bulk-Synchronous On-Line Crawling on Clusters of Computers
This paper describes the design of a crawler devised to perform the periodic retrieval of Web documents for a search engine able to accept on-line updates in a concurrent manner. ...
Mauricio Marín, Carolina Bonacic
ECIR
2006
Springer
13 years 9 months ago
Automatic Document Organization in a P2P Environment
This paper describes an efficient method to construct reliable machine learning applications in peer-to-peer (P2P) networks by building ensemble-based meta methods. We co...
Stefan Siersdorfer, Sergej Sizov
WEBI
2007
Springer
14 years 1 month ago
K-SVMeans: A Hybrid Clustering Algorithm for Multi-Type Interrelated Datasets
Identification of distinct clusters of documents in text collections has traditionally been addressed by making the assumption that the data instances can only be represented by ...
Levent Bolelli, Seyda Ertekin, Ding Zhou, C. Lee G...
WWW
2002
ACM
14 years 8 months ago
The structure of broad topics on the web
The Web graph is a giant social network whose properties have been measured and modeled extensively in recent years. Most such studies concentrate on the graph structure alone, an...
Soumen Chakrabarti, Mukul Joshi, Kunal Punera, Dav...
WWW
2009
ACM
14 years 6 days ago
Extracting data records from the web using tag path clustering
Fully automatic methods that extract lists of objects from the Web have been studied extensively. Record extraction, the first step of this object extraction process, identifies...
Gengxin Miao, Jun'ichi Tatemura, Wang-Pin Hsiung, ...