Search Sciweavers | Sciweavers

» Ontology-Focused Crawling of Web Documents

WWW
2009
ACM

Internet Technology

User-centric content freshness metrics for search engines

Download www2009.org

In order to return relevant search results, a search engine must keep its local repository synchronized to the Web, but it is usually impossible to attain perfect freshness. Hence...

Ali Dasdan, Xinh Huynh

ADAPTIVE
2007
Springer

Internet Technology

Web Document Modeling

Download www.dcs.warwick.ac.uk

A very common issue of adaptive Web-Based systems is the modeling of documents. Such documents represent domain-specific information for a number of purposes. Application areas su...

Alessandro Micarelli, Filippo Sciarrone, Mauro Mar...

PVLDB
2008

WebTables: exploring the power of tables on the web

Download turing.cs.washington.edu

The World-Wide Web consists of a huge number of unstructured documents, but it also contains structured data in the form of HTML tables. We extracted 14.1 billion HTML tables from...

Michael J. Cafarella, Alon Y. Halevy, Daisy Zhe Wa...

WWW
2003
ACM

Internet Technology

Dynamic maintenance of web indexes using landmarks

Download www.research.ibm.com

Recent work on incremental crawling has enabled the indexed document collection of a search engine to be more synchronized with the changing World Wide Web. However, this synchron...

Lipyeow Lim, Min Wang, Sriram Padmanabhan, Jeffrey...

CLEF
2005
Springer

Information Technology

EuroGOV: Engineering a Multilingual Web Corpus

Download www.clef-campaign.org

EuroGOV is a multilingual web corpus that was created to serve as the document collection for WebCLEF, the CLEF 2005 web retrieval task. EuroGOV is a collection of web pages crawl...

Börkur Sigurbjörnsson, Jaap Kamps, Maart...
