Search Sciweavers | Sciweavers

391 search results - page 3 / 79

» Finding and Extracting Data Records from Web Pages

SIGKDD
2010

111 views

Unexpected results in automatic list extraction on the web

14 years 10 months ago

Download www.sigkdd.org

The discovery and extraction of general lists on the Web continues to be an important problem facing the Web mining community. There have been numerous studies that claim to autom...

Tim Weninger, Fabio Fumarola, Rick Barber, Jiawei ...

AAAI
2006

233 views, Intelligent Agents

Automatic Wrapper Generation Using Tree Matching and Partial Tree Alignment

15 years 5 months ago

Download www.aaai.org

This paper is concerned with the problem of structured data extraction from Web pages. The objective of the research is to automatically segment data records in a page, extract da...

Yanhong Zhai, Bing Liu

COLING
2010

187 views, Computational Linguistics

A Novel Method for Bilingual Web Page Acquisition from Search Engine Web Records

14 years 10 months ago

Download www.aclweb.org

A new approach has been developed for acquiring bilingual web pages from the result pages of search engines, which is composed of two challenging tasks. The first task is to detec...

Yanhui Feng, Yu Hong, Zhenxiang Yan, Jian-Min Yao,...

DEXAW
2004
IEEE

130 views, Database

Data Extraction from Web Data Sources

15 years 7 months ago

Download www.essex.ac.uk

This paper provides an explanation of the basic data structures used in a new page analysis technique to create wrappers (data extractors) for the result pages produced by web sit...

Jerome Robinson

SIGIR
2005
ACM

156 views, Information Technology

Title extraction from bodies of HTML documents and its application to web page retrieval

15 years 9 months ago

Download research.microsoft.com

This paper is concerned with automatic extraction of titles from the bodies of HTML documents. Titles of HTML documents should be correctly defined in the title fields; however, i...

Yunhua Hu, Guomao Xin, Ruihua Song, Guoping Hu, Sh...
