Search Sciweavers | Sciweavers

563 search results - page 42 / 113

» Crawling the web for structured documents

WWW 2003, ACM (Internet Technology)

The XML web: a first study

16 years 4 months ago

Download www.cs.toronto.edu

Although originally designed for large-scale electronic publishing, XML plays an increasingly important role in the exchange of data on the Web. In fact, it is expected that XML w...

Laurent Mignet, Denilson Barbosa, Pierangelo Veltr...


CIDR 2009 (Algorithms)

Extracting and Querying a Comprehensive Web Database

15 years 5 months ago

Download turing.cs.washington.edu

Recent research in domain-independent information extraction holds the promise of an automatically-constructed structured database derived from the Web. A query system based on th...

Michael J. Cafarella


SIGIR 2009, ACM (Information Technology)

Building enriched document representations using aggregated anchor text

15 years 10 months ago

Download ciir.cs.umass.edu

It is well known that anchor text plays a critical role in a variety of search tasks performed over hypertextual domains, including enterprise search, wiki search, and web search....

Donald Metzler, Jasmine Novak, Hang Cui, Srihari R...


WWW 2008, ACM (Internet Technology)

As we may perceive: finding the boundaries of compound documents on the web

16 years 4 months ago

Download www2008.org

This paper considers the problem of identifying on the Web compound documents (cDocs): groups of web pages that in aggregate constitute semantically coherent information entities...

Pavel Dmitriev


ACSC 2006, IEEE (Theoretical Computer Science)

Using formal concept analysis with an incremental knowledge acquisition system for web document management

15 years 10 months ago

Download eprints.utas.edu.au

It is necessary to provide a method to store Web information effectively so it can be utilised as a future knowledge resource. A commonly adopted approach is to classify the retri...

Timothy J. Everts, Sung Sik Park, Byeong Ho Kang
