Search Sciweavers | Sciweavers

2677 search results - page 152 / 536

» Extracting Structured Data from Web Pages


WWW
2003
ACM

133 views, Internet Technology

Efficient URL caching for world wide web crawling

16 years 3 months ago

Download research.microsoft.com

Crawling the web is deceptively simple: the basic algorithm is (a) Fetch a page (b) Parse it to extract all linked URLs (c) For all the URLs not seen before, repeat (a)–(c). Howev...

Andrei Z. Broder, Marc Najork, Janet L. Wiener
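
The basic loop in steps (a) to (c) of the abstract above can be sketched as follows. This is a minimal illustration only, not the paper's caching method: the names `crawl`, `LinkParser`, `fetch`, and `max_pages` are hypothetical, the "seen" test is a plain in-memory set, and `fetch` is passed in as a callable so the sketch runs without network access.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed, fetch, max_pages=100):
    """Breadth-first crawl from seed.

    fetch(url) is an assumed callable returning HTML text, or None
    when the page cannot be retrieved.
    """
    seen = {seed}              # URLs already discovered (the test in step c)
    frontier = deque([seed])   # URLs discovered but not yet fetched
    visited = []               # URLs fetched successfully, in order
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        html = fetch(url)                  # step (a): fetch a page
        if html is None:
            continue
        visited.append(url)
        parser = LinkParser()
        parser.feed(html)                  # step (b): extract linked URLs
        for href in parser.links:
            link = urljoin(url, href)      # resolve relative links
            if link not in seen:           # step (c): only unseen URLs
                seen.add(link)
                frontier.append(link)
    return visited


# Usage with a small in-memory "web" standing in for real HTTP fetches.
pages = {
    "http://x.test/": '<a href="/a">a</a><a href="/b">b</a>',
    "http://x.test/a": '<a href="/">home</a><a href="/b">b</a>',
    "http://x.test/b": '<a href="/c">c</a>',
}
order = crawl("http://x.test/", pages.get)
```

The membership check on `seen` runs once per extracted link, so it tends to dominate the work at scale; the unbounded set used here is the simplest possible stand-in for that check.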



WWW
2011
ACM

182 views, Internet Technology

FACTO: a fact lookup engine based on web tables

14 years 9 months ago

Download research.microsoft.com

Recently answers for fact lookup queries have appeared on major search engines. For example, for the query {Barack Obama date of birth} Google directly shows “4 August 1961” a...

Xiaoxin Yin, Wenzhao Tan, Chao Liu



DOCENG
2004
ACM

98 views, Document Analysis

The lifecycle of a digital historical document: structure and content

15 years 8 months ago

Download www.cse.salford.ac.uk

This paper describes the lifecycle of a digital historical document, from template-based structure definition through to content extraction from the scanned pages and its final re...

Apostolos Antonacopoulos, Dimosthenis Karatzas, He...



IAT
2008
IEEE

183 views, Intelligent Agents

Acquiring Vague Temporal Information from the Web

15 years 9 months ago

Download users.ugent.be

Many real-world information needs are naturally formulated as queries with temporal constraints. However, the structured temporal background information needed to support such c...

Steven Schockaert, Martine De Cock, Etienne E. Ker...



DEXAW
2002
IEEE

145 views, Database

An Architecture for Collaboratively Assembled Moderated Information Bearing Web Sites

15 years 7 months ago

Download www.dcs.gla.ac.uk

As originally conceived, the World Wide Web was intended for the purpose of sharing information. Many websites realise this aim by publishing pages from a data repository which su...

Richard Cooper

