Search Sciweavers | Sciweavers

2677 search results - page 34 / 536

» Extracting Structured Data from Web Pages

ECIR
2009
Springer

155 views, Information Technology

PathRank: Web Page Retrieval with Navigation Path

15 years 4 months ago

Download goanna.cs.rmit.edu.au

Abstract. This paper describes a path-based method to use the multi-step navigation information discovered from website structures for web page ranking. Use of hyperlinks to enhanc...

Jianqiang Li, Yu Zhao 0002

DATESO
2009

105 views, Database

From Web Pages to Web Communities

15 years 4 months ago

Download sunsite.informatik.rwth-aachen.de

In this paper we are looking for a relationship between the intent of Web pages, their architecture and the communities who take part in their usage and creation. From our point of...

Milos Kudelka, Václav Snásel, Zdenek...

DEXAW
2008
IEEE

123 views, Database

Text Extraction from the Web via Text-to-Tag Ratio

16 years 1 month ago

Download www.uni-weimar.de

We describe a method to extract content text from diverse Web pages by using the HTML document’s Text-to-Tag Ratio rather than specific HTML cues that may not be constant acr...

Tim Weninger, William H. Hsu

ICDM
2002
IEEE

143 views, Data Mining

Automatic Web Page Classification in a Dynamic and Hierarchical Way

15 years 12 months ago

Download www2.latech.edu

Automatic classification of web pages is an effective way to deal with the difficulty of retrieving information from the Internet. Although there are many automatic classification...

Xiaogang Peng, Ben Choi

LREC
2008

108 views, Education

A Lightweight and Efficient Tool for Cleaning Web Pages

15 years 8 months ago

Download www.lrec-conf.org

Originally conceived as a "naive" baseline experiment using traditional n-gram language models as classifiers, the NCLEANER system has turned out to be a fast and lightw...

Stefan Evert
