This paper presents a system for enabling offline web use to satisfy the information needs of disconnected communities. We describe the design, implementation, evaluation, and pil...
Jay Chen, Russell Power, Lakshminarayanan Subraman...
We present an algorithm, witch, that learns to detect spam hosts or pages on the Web. Unlike most other approaches, it simultaneously exploits the structure of the Web graph as we...
Jacob Abernethy, Olivier Chapelle, Carlos Castillo
In this paper we study duplicates on the Web, using collections containing documents of all sites under the .cl domain that represent accurate and representative subsets of the We...
The advent of e-commerce has created a trend that brought thousands of catalogs online. Most of these websites are “taxonomy-directed”. A Web site is said to be ``taxonomydire...
Intelligence of humankind mostly includes five parts: the observing ability, the memory ability, the practice ability, the thought ability, the imagining ability, etc.. In this pa...