Sciweavers

» Extracting Web Data Using Instance-Based Learning
ACL
2010
15 years 4 days ago
Learning 5000 Relational Extractors
Many researchers are trying to use information extraction (IE) to create large-scale knowledge bases from natural language text on the Web. However, the primary approach (supervis...
Raphael Hoffmann, Congle Zhang, Daniel S. Weld
ICPR
2004
IEEE
16 years 3 months ago
Relevant Linear Feature Extraction Using Side-information and Unlabeled Data
"Learning with side-information" is attracting more and more attention in machine learning problems. In this paper, we propose a general iterative framework for relevant...
Changshui Zhang, Fei Wu, Yonglei Zhou
IJCAI
2003
15 years 3 months ago
Information Extraction from Web Documents Based on Local Unranked Tree Automaton Inference
Information extraction (IE) aims at extracting specific information from a collection of documents. A lot of previous work on IE from semi-structured documents (in XML or HTML) us...
Raymond Kosala, Maurice Bruynooghe, Jan Van den Bu...
WWW
2001
ACM
16 years 2 months ago
Effective Web data extraction with standard XML technologies
We discuss the problem of Web data extraction and describe an XML-based methodology whose goal extends far beyond simple "screen scraping." An ideal data extraction proc...
Jussi Myllymaki
WIDM
2003
ACM
15 years 7 months ago
Datarover: a taxonomy based crawler for automated data extraction from data-intensive websites
The advent of e-commerce has created a trend that brought thousands of catalogs online. Most of these websites are “taxonomy-directed”. A Web site is said to be “taxonomy-dire...
Hasan Davulcu, S. Koduri, Saravanakumar Nagarajan