Search Sciweavers | Sciweavers


» Crawling the web for structured documents


CACM
1998


Viewing WISs as Database Applications


Download www.cs.toronto.edu

abstraction for modeling these problems is to view the Web as a collection of (usually small and heterogeneous) databases, and to view programs that extract and process Web data au...

Gustavo O. Arocena, Alberto O. Mendelzon


CIKM
2005
Springer


Structure-based query-specific document summarization


Download users.cis.fiu.edu

Summarization of text documents is increasingly important with the amount of data available on the Internet. The large majority of current approaches view documents as linear sequ...

Ramakrishna Varadarajan, Vagelis Hristidis


SEBD
2007


Disambiguation of Structure-Based Information in the STRIDER System


Download www.isgroup.unimo.it

We present the current version of STRIDER, a versatile system for the disambiguation of structure-based information like XML schemas, structures of XML documents and web director...

Federica Mandreoli, Riccardo Martoglia, Enrico Ron...


CIKM
2009
Springer


Compact full-text indexing of versioned document collections


Download cis.poly.edu

We study the problem of creating highly compressed full-text index structures for versioned document collections, that is, collections that contain multiple versions of each docume...

Jinru He, Hao Yan, Torsten Suel


WWW
2011
ACM


Identifying primary content from web pages and its application to web search ranking


Download www.www2011india.com

Web pages are usually highly structured documents. In some documents, content with different functionality is laid out in blocks, some merely supporting the main discourse. In ot...

Srinivas Vadrevu, Emre Velipasaoglu
