Sciweavers

894 search results - page 134 / 179
» Analysis of Web Search Engine Clicked Documents
CPM 2000, Springer
14 years 4 days ago
Identifying and Filtering Near-Duplicate Documents
Abstract. The mathematical concept of document resemblance captures well the informal notion of syntactic similarity. The resemblance can be estimated using a fixed size “sketch...
Andrei Z. Broder
WWW 2007, ACM
14 years 8 months ago
A large-scale study of robots.txt
Search engines largely rely on Web robots to collect information from the Web. Due to the unregulated open-access nature of the Web, robot activities are extremely diverse. Such c...
Yang Sun, Ziming Zhuang, C. Lee Giles
ELPUB 1998, ACM
14 years 21 hours ago
Research Information Take Away or How to Serve Research Information Fast and Friendly on the Web
In 1997 the library department at the University of Karlskrona/Ronneby was asked to develop a database which could be used to collate and present all the research material and ong...
Peter Linde, Leif Lagebrand
ICDE 2002, IEEE
14 years 9 months ago
Design and Implementation of a High-Performance Distributed Web Crawler
Broad web search engines as well as many more specialized search tools rely on web crawlers to acquire large collections of pages for indexing and analysis. Such a web crawler may...
Vladislav Shkapenyuk, Torsten Suel
WWW 2006, ACM
14 years 8 months ago
POLYPHONET: an advanced social network extraction system from the web
Social networks play important roles in the Semantic Web: knowledge management, information retrieval, ubiquitous computing, and so on. We propose a social network extraction syst...
Hideaki Takeda, Junichiro Mori, Kôiti Hasida...