Sciweavers

62 search results - page 1 / 13
» Learning Page-Independent Heuristics for Extracting Data fro...
SYNASC
2006
IEEE
211 views, Algorithms
14 years 5 months ago
HTML Pattern Generator--Automatic Data Extraction from Web Pages
Existing methods of information extraction from HTML documents include manual approach, supervised learning and automatic techniques. The manual method has high precision and reca...
Mirel Cosulschi, Adrian Giurca, Bogdan Udrescu, Ni...
SIGMOD
2003
ACM
190 views, Database
14 years 4 months ago
Extracting Structured Data from Web Pages
Many web sites contain large sets of pages generated using a common template or layout. For example, Amazon lays out the author, title, comments, etc. in the same way in all its b...
Arvind Arasu, Hector Garcia-Molina
WISE
2005
Springer
14 years 4 months ago
Extracting Web Data Using Instance-Based Learning
This paper studies structured data extraction from Web pages, e.g., online product description pages. Existing approaches to data extraction include wrapper induction and automatic...
Yanhong Zhai, Bing Liu
ICDM
2002
IEEE
162 views, Data Mining
14 years 3 months ago
Recognition of Common Areas in a Web Page Using Visual Information: a possible application in a page classification
Extracting and processing information from web pages is an important task in many areas like constructing search engines, information retrieval, and data mining from the Web. Comm...
Milos Kovacevic, Michelangelo Diligenti, Marco Gor...