Sciweavers

295 search results - page 44 / 59
» Web Crawling
ADAPTIVE
2007
Springer
14 years 4 months ago
Web Document Modeling
A very common issue of adaptive Web-Based systems is the modeling of documents. Such documents represent domain-specific information for a number of purposes. Application areas su...
Alessandro Micarelli, Filippo Sciarrone, Mauro Mar...
WWW
2007
ACM
14 years 10 months ago
Efficient search in large textual collections with redundancy
Current web search engines focus on searching only the most recent snapshot of the web. In some cases, however, it would be desirable to search over collections that include many ...
Jiangong Zhang, Torsten Suel
WAW
2004
Springer
14 years 3 months ago
Do Your Worst to Make the Best: Paradoxical Effects in PageRank Incremental Computations
Deciding which kind of visit accumulates high-quality pages more quickly is one of the most often debated issues i...
Paolo Boldi, Massimo Santini, Sebastiano Vigna
WSDM
2010
ACM
14 years 4 months ago
Learning URL patterns for webpage de-duplication
Presence of duplicate documents in the World Wide Web adversely affects crawling, indexing and relevance, which are the core building blocks of web search. In this paper, we pres...
Hema Swetha Koppula, Krishna P. Leela, Amit Agarwa...
WWW
2005
ACM
14 years 10 months ago
Analyzing online discussion for marketing intelligence
We present a system that gathers and analyzes online discussion as it relates to consumer products. Weblogs and online message boards provide forums that record the voice of the p...
Natalie S. Glance, Matthew Hurst, Kamal Nigam, Mat...