Search Sciweavers | Sciweavers

139 search results - page 20 / 28

» An Approach to Identify Duplicated Web Pages

WIDM
2006
ACM

148 views, Internet Technology

Coarse-grained classification of web sites by their structural properties

14 years 1 month ago

Download rvs.informatik.uni-leipzig.de

In this paper, we identify and analyze structural properties which reflect the functionality of a Web site. These structural properties consider the size, the organization, the co...

Christoph Lindemann, Lars Littig


WEBI
2009
Springer

115 views, Internet Technology

Mining a Multilingual Geographical Gazetteer from the Web

14 years 2 months ago

Download comupedia.org

Geographical gazetteers are necessary in a wide variety of applications. In the past, the construction of such gazetteers has been a tedious, manual process and only recently have...

Adrian Popescu, Gregory Grefenstette, Houda Bouamo...


AUSAI
2003
Springer

81 views, Artificial Intelligence

Information Extraction via Path Merging

14 years 28 days ago

Download www.ict.csiro.au

Abstract. In this paper, we describe a new approach to information extraction that neatly integrates top-down hypothesis driven information with bottom-up data driven information. ...

Robert Dale, Cécile Paris, Marc Tilbrook


LREC
2010

216 views, Education

BlogBuster: A Tool for Extracting Corpora from the Blogosphere

13 years 9 months ago

Download www.lrec-conf.org

This paper presents BlogBuster, a tool for extracting a corpus from the blogosphere. The topic of cleaning arbitrary web pages with the goal of extracting a corpus from web data, ...

Georgios Petasis, Dimitrios Petasis


KCAP
2005
ACM

165 views, Information Technology

AutoFeed: an unsupervised learning system for generating webfeeds

14 years 1 month ago

Download www.isi.edu

The AutoFeed system automatically extracts data from semistructured web sites. Previously, researchers have developed two types of supervised learning approaches for extracting we...

Bora Gazen, Steven Minton
