Sciweavers

391 search results - page 3 / 79
Finding and Extracting Data Records from Web Pages
SIGKDD
2010
14 years 10 months ago
Unexpected results in automatic list extraction on the web
The discovery and extraction of general lists on the Web continues to be an important problem facing the Web mining community. There have been numerous studies that claim to autom...
Tim Weninger, Fabio Fumarola, Rick Barber, Jiawei ...
AAAI
2006
15 years 5 months ago
Automatic Wrapper Generation Using Tree Matching and Partial Tree Alignment
This paper is concerned with the problem of structured data extraction from Web pages. The objective of the research is to automatically segment data records in a page, extract da...
Yanhong Zhai, Bing Liu
COLING
2010
14 years 10 months ago
A Novel Method for Bilingual Web Page Acquisition from Search Engine Web Records
A new approach has been developed for acquiring bilingual web pages from the result pages of search engines, which is composed of two challenging tasks. The first task is to detec...
Yanhui Feng, Yu Hong, Zhenxiang Yan, Jian-Min Yao,...
DEXAW
2004
IEEE
15 years 7 months ago
Data Extraction from Web Data Sources
This paper provides an explanation of the basic data structures used in a new page analysis technique to create wrappers (data extractors) for the result pages produced by web sit...
Jerome Robinson
SIGIR
2005
ACM
15 years 9 months ago
Title extraction from bodies of HTML documents and its application to web page retrieval
This paper is concerned with automatic extraction of titles from the bodies of HTML documents. Titles of HTML documents should be correctly defined in the title fields; however, i...
Yunhua Hu, Guomao Xin, Ruihua Song, Guoping Hu, Sh...