Sciweavers

1541 search results - page 20 / 309
» Extracting Web Data Using Instance-Based Learning
ACL
2010
13 years 6 months ago
Learning 5000 Relational Extractors
Many researchers are trying to use information extraction (IE) to create large-scale knowledge bases from natural language text on the Web. However, the primary approach (supervis...
Raphael Hoffmann, Congle Zhang, Daniel S. Weld
ICPR
2004
IEEE
14 years 10 months ago
Relevant Linear Feature Extraction Using Side-information and Unlabeled Data
"Learning with side-information" is attracting more and more attention in machine learning problems. In this paper, we propose a general iterative framework for relevant...
Changshui Zhang, Fei Wu, Yonglei Zhou
IJCAI
2003
13 years 10 months ago
Information Extraction from Web Documents Based on Local Unranked Tree Automaton Inference
Information extraction (IE) aims at extracting specific information from a collection of documents. A lot of previous work on IE from semi-structured documents (in XML or HTML) us...
Raymond Kosala, Maurice Bruynooghe, Jan Van den Bu...
WWW
2001
ACM
14 years 9 months ago
Effective Web data extraction with standard XML technologies
We discuss the problem of Web data extraction and describe an XML-based methodology whose goal extends far beyond simple "screen scraping." An ideal data extraction proc...
Jussi Myllymaki
WIDM
2003
ACM
14 years 2 months ago
Datarover: a taxonomy based crawler for automated data extraction from data-intensive websites
The advent of e-commerce has created a trend that brought thousands of catalogs online. Most of these websites are “taxonomy-directed”. A Web site is said to be “taxonomy-dire...
Hasan Davulcu, S. Koduri, Saravanakumar Nagarajan