Sciweavers

468 search results - page 4 / 94
» Automatic Data Extraction from Data-Rich Web Pages
Sort
View
AAAI
2006
13 years 8 months ago
Automatic Wrapper Generation Using Tree Matching and Partial Tree Alignment
This paper is concerned with the problem of structured data extraction from Web pages. The objective of the research is to automatically segment data records in a page, extract da...
Yanhong Zhai, Bing Liu
WWW
2011
ACM
13 years 1 months ago
HyLiEn: a hybrid approach to general list extraction on the web
We consider the problem of automatically extracting general lists from the web. Existing approaches are mostly dependent upon either the underlying HTML markup or the visual struc...
Fabio Fumarola, Tim Weninger, Rick Barber, Donato ...
SIGKDD
2010
111views more  SIGKDD 2010»
13 years 1 months ago
Unexpected results in automatic list extraction on the web
The discovery and extraction of general lists on the Web continues to be an important problem facing the Web mining community. There have been numerous studies that claim to autom...
Tim Weninger, Fabio Fumarola, Rick Barber, Jiawei ...
CIKM
2003
Springer
14 years 3 days ago
Extracting unstructured data from template generated web documents
We propose a novel approach that identifies web page templates and extracts the unstructured data. Extracting only the body of the page and eliminating the template increases the ...
Ling Ma, Nazli Goharian, Abdur Chowdhury, Misun Ch...
BIBE
2004
IEEE
156views Bioinformatics» more  BIBE 2004»
13 years 10 months ago
GeneWebEx: Gene Annotation Web Extraction, Aggregation, and Updating from Web-Based Biomolecular Databanks
Numerous genomic annotations are currently stored in different web-accessible databanks that scientists need to mine with user-defined queries and in a batch mode to orderly integ...
Marco Masseroli, Andrea Stella, Natalia Meani, Myr...