Sciweavers

1109 search results - page 47 / 222
» Crawling on web graphs
WWW
2011
ACM
14 years 11 months ago
Design and implementation of contextual information portals
This paper presents a system for enabling offline web use to satisfy the information needs of disconnected communities. We describe the design, implementation, evaluation, and pil...
Jay Chen, Russell Power, Lakshminarayanan Subraman...
AIRWEB
2008
Springer
15 years 6 months ago
Web spam identification through content and hyperlinks
We present an algorithm, witch, that learns to detect spam hosts or pages on the Web. Unlike most other approaches, it simultaneously exploits the structure of the Web graph as we...
Jacob Abernethy, Olivier Chapelle, Carlos Castillo
LAWEB
2006
IEEE
15 years 10 months ago
Where and How Duplicates Occur in the Web
In this paper we study duplicates on the Web, using collections containing documents of all sites under the .cl domain that represent accurate and representative subsets of the We...
Álvaro R. Pereira Jr., Ricardo A. Baeza-Yat...
WIDM
2003
ACM
15 years 9 months ago
Datarover: a taxonomy based crawler for automated data extraction from data-intensive websites
The advent of e-commerce has created a trend that brought thousands of catalogs online. Most of these websites are “taxonomy-directed”. A Web site is said to be “taxonomydire...
Hasan Davulcu, S. Koduri, Saravanakumar Nagarajan
JCIT
2008
15 years 4 months ago
An Intelligent Model and Its Implementation of Search Engine
Intelligence of humankind mostly includes five parts: the observing ability, the memory ability, the practice ability, the thought ability, the imagining ability, etc. In this pa...
Yajun Du, Haiming Li