Sciweavers

2190 search results - page 29 / 438
» Unweaving a web of documents
CN
1999
13 years 8 months ago
Focused Crawling: A New Approach to Topic-Specific Web Resource Discovery
The rapid growth of the World-Wide Web poses unprecedented scaling challenges for general-purpose crawlers and search engines. In this paper we describe a new hypertext resource d...
Soumen Chakrabarti, Martin van den Berg, Byron Dom
LISA
2003
13 years 10 months ago
DryDock: A Document Firewall
Auditing a web site’s content is an arduous task. For any given page on a web server, system administrators are often ill-equipped to determine who created the document, why it...
Deepak Giridharagopal
CORR
2007
Springer
13 years 9 months ago
SWI-Prolog and the Web
Prolog is an excellent tool for representing and manipulating data written in formal languages as well as natural language. Its safe semantics and automatic memory management make...
Jan Wielemaker, Zhisheng Huang, Lourens van der Me...
DEXAW
2004
IEEE
14 years 22 days ago
Multilingual and Multimedia Information Retrieval from Web Documents
Web documents present new challenges to conventional Information Retrieval (IR) technologies. This paper describes how these challenges are faced in FameIR, a multilingual multime...
Marta Gatius, Manuel Bertrán, Horacio Rodr...
EP
1998
Springer
14 years 1 months ago
Measuring Structural Similarity Among Web Documents: Preliminary Results
When we describe a Web page informally, we often use phrases like "it looks like a newspaper site", "there are several unordered lists" or "it's just a collection of li...
Isabel F. Cruz, Slava Borisov, Michael A. Marks, T...