Improving Rule Generation Precision for Domain Knowledge based Wrappers

Wrappers play an important role in extracting specified information from various sources. The wrapper rules by which information is extracted are often created from domain-specific knowledge. Domain-specific knowledge helps in recognizing the meaning of text that represents various entities and values, and in detecting their formats. However, such domain knowledge becomes powerless when value-representing data are not labeled with appropriate textual descriptions, or when there is nothing but a hyperlink where certain text labels or values are expected. To alleviate these problems, we propose a probabilistic method for recognizing the entity type, i.e., generating wrapper rules, when there is no label associated with value-representing text. In addition, we have devised a method for using the information reachable by following hyperlinks when textual data are not immediately available on the target web page. Our experimental work shows that the proposed methods help increase precision of t...
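The abstract does not spell out the probabilistic model, so the following is only a minimal sketch of the general idea: guessing the entity type of an unlabeled value from its format. It assumes a naive Bayes model over a few coarse format features, which is a stand-in and not necessarily the authors' method. The feature set, the class name `EntityTypeGuesser`, and the training values are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def shape_features(text):
    """Map a value string to coarse binary format features (hypothetical set)."""
    return [
        "has_digit" if re.search(r"\d", text) else "no_digit",
        "has_at" if "@" in text else "no_at",
        "has_currency" if re.search(r"[$€]", text) else "no_currency",
        "has_dash" if "-" in text else "no_dash",
    ]


class EntityTypeGuesser:
    """Naive Bayes over format features; an illustrative assumption only."""

    def __init__(self):
        self.type_counts = Counter()
        self.feat_counts = defaultdict(Counter)

    def train(self, examples):
        # examples: iterable of (value_text, entity_type) pairs
        for text, etype in examples:
            self.type_counts[etype] += 1
            for feat in shape_features(text):
                self.feat_counts[etype][feat] += 1

    def posterior(self, text):
        feats = shape_features(text)
        total = sum(self.type_counts.values())
        log_scores = {}
        for etype, n in self.type_counts.items():
            score = math.log(n / total)  # prior P(type)
            for feat in feats:
                # Add-one smoothing; every feature has two possible outcomes.
                score += math.log((self.feat_counts[etype][feat] + 1) / (n + 2))
            log_scores[etype] = score
        # Normalize in a numerically stable way so the values sum to 1.
        top = max(log_scores.values())
        weights = {t: math.exp(s - top) for t, s in log_scores.items()}
        norm = sum(weights.values())
        return {t: w / norm for t, w in weights.items()}

    def guess(self, text):
        post = self.posterior(text)
        return max(post, key=post.get)


# Hypothetical labeled values, as might be collected from pages that do carry labels.
guesser = EntityTypeGuesser()
guesser.train([
    ("$12.50", "price"), ("$3,999", "price"), ("€45", "price"),
    ("a@b.com", "email"), ("info@example.org", "email"),
    ("02-555-0100", "phone"), ("010-1234-5678", "phone"),
])
```

Calling `guesser.guess(value)` on an unlabeled value returns the most probable type under these assumptions, and `guesser.posterior(value)` exposes the full distribution, so a rule generator could decline to emit a rule when no type is sufficiently likely.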

Chang-Hoo Jeong, Sung-Jin Jhun, Myung-Eun Lim, Sun

CIMCA 2005 | Domain-specific Knowledge | Various Sources | Wrapper Rules |

» Backward chaining rule induction

» Iterative Learning of Weighted Rule Sets for Greedy Search

» Knowledge Component of a Multiagent Distributed Decision Support System

» Automatic Text Summarization Based on Lexical Chains

» Extracting article text from the web with maximum subsequence segmentation

» Context Query in Information Retrieval

Post Info

Added	24 Jun 2010
Updated	24 Jun 2010
Type	Conference
Year	2005
Where	CIMCA
Authors	Chang-Hoo Jeong, Sung-Jin Jhun, Myung-Eun Lim, Sung-Hyon Myaeng
