Search Sciweavers | Sciweavers

684 search results - page 15 / 137

» Extracting semantic structure of web documents using content...

WEBDB
1999
Springer

196 views | Database

Web Ecology: Recycling HTML Pages as XML Documents Using W4F

15 years 9 months ago

Download db.cis.upenn.edu

In this paper we present the World-Wide Web Wrapper Factory (W4F), a Java toolkit to generate wrappers for Web data sources. Some key features of W4F are an expressive language to...

Arnaud Sahuguet, Fabien Azavant

ISCIS
2003
Springer

138 views | Information Technology

A Cooperative Paradigm for Fighting Information Overload

15 years 10 months ago

Download di002.edv.uniovi.es

The Web is mainly processed by humans. The role of the machines is just to transmit and display the contents of the documents, barely being able to do something else. Nowadays ther...

Daniel Gayo-Avello, Darío Álvarez Gu...

ICIP
2003
IEEE

130 views | Image Processing

Structuralizing educational videos based on presentation content

16 years 6 months ago

Download www.aquaphoenix.com

This work addresses the challenge of extracting structure in educational and training media based on the type of material that is presented during lectures and training sessions. ...

Chitra Dorai, Vincent Oria, Viswanath Neelavalli

WWW
2010
ACM

188 views | Internet Technology

Exploiting content redundancy for web information extraction

15 years 5 months ago

Download www.comp.nus.edu.sg

We propose a novel extraction approach that exploits content redundancy on the web to extract structured data from template-based web sites. We start by populating a seed database...

Pankaj Gulhane, Rajeev Rastogi, Srinivasan H. Seng...

HT
1996
ACM

175 views | Internet Technology

HyPursuit: A Hierarchical Network Search Engine that Exploits Content-Link Hypertext Clustering

15 years 9 months ago

Download www.psrg.lcs.mit.edu

HyPursuit is a new hierarchical network search engine that clusters hypertext documents to structure a given information space for browsing and search activities. Our content-link...

Ron Weiss, Bienvenido Vélez, Mark A. Sheldo...

