Sciweavers

DMKD
2004
ACM
14 years 3 months ago
Discovery of ads web hosts through traffic data analysis
One of the most topical problems in web crawling
V. Bacarella, Fosca Giannotti, Mirco Nanni, Dino P...
WIDM
2004
ACM
14 years 4 months ago
Probabilistic models for focused web crawling
A focused crawler must use information gleaned from previously crawled page sequences to estimate the relevance of a newly seen URL. Therefore, good performance depends on powerfu...
Hongyu Liu, Evangelos E. Milios, Jeannette Janssen
ICDE
2005
IEEE
14 years 5 months ago
Simulation Study of Language Specific Web Crawling
The Web has been recognized as an important part of our cultural heritage. Many nations started archiving national web spaces for future generations. A key technology for data acqu...
Kulwadee Somboonviwat, Masaru Kitsuregawa, Takayuk...
WEBI
2009
Springer
14 years 6 months ago
Learning Deep Web Crawling with Diverse Features
The key to Deep Web crawling is to submit promising keywords to a query form and retrieve Deep Web content efficiently. To select keywords, existing methods make a decision based ...
Lu Jiang, Zhaohui Wu, Qinghua Zheng, Jun Liu
ADMA
2009
Springer
14 years 6 months ago
Crawling Deep Web Using a New Set Covering Algorithm
Crawling the deep web often requires the selection of an appropriate set of queries so that they can cover most of the documents in the data source with low cost. This ca...
Yan Wang, Jianguo Lu, Jessica Chen