Sciweavers

JAIR
2010

Constructing Reference Sets from Unstructured, Ungrammatical Text

Vast amounts of text on the Web are unstructured and ungrammatical, such as classified ads, auction listings, forum postings, etc. We call such text “posts.” Despite their inconsistent structure and lack of grammar, posts are full of useful information. This paper presents work on semi-automatically building tables of relational information, called “reference sets,” by analyzing such posts directly. Reference sets can be applied to a number of tasks such as ontology maintenance and information extraction. Our reference-set construction method starts with just a small amount of background knowledge, and constructs tuples representing the entities in the posts to form a reference set. We also describe an extension to this approach for the special case where even this small amount of background knowledge is impossible to discover and use. To evaluate the utility of the machine-constructed reference sets, we compare them to manually constructed reference sets in the context of ref...
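To make the terminology concrete, the sketch below shows what a reference set might look like as a small table of tuples, and how such a table could be used to annotate a post. It is a toy illustration only, not the authors' construction method: the `CarTuple` type, the `REFERENCE_SET` rows, the sample post, and the token-overlap scoring in `best_match` are all hypothetical assumptions introduced here.

```python
from typing import List, NamedTuple, Optional, Set


class CarTuple(NamedTuple):
    """One row of a hypothetical reference set (attribute names are assumed)."""
    make: str
    model: str


# Hypothetical reference set: a small table of relational tuples.
REFERENCE_SET: List[CarTuple] = [
    CarTuple("Honda", "Civic"),
    CarTuple("Honda", "Accord"),
    CarTuple("Toyota", "Camry"),
]


def tokens(text: str) -> Set[str]:
    """Lowercase whitespace tokens with surrounding punctuation stripped."""
    return {t.strip(".,!-$").lower() for t in text.split()}


def best_match(post: str, reference_set: List[CarTuple]) -> Optional[CarTuple]:
    """Return the tuple sharing the most attribute values with the post.

    Assumption: simple exact-token overlap stands in for whatever
    matching a real system would use. Returns None if nothing overlaps.
    """
    post_tokens = tokens(post)
    best, best_score = None, 0
    for row in reference_set:
        score = sum(1 for value in row if value.lower() in post_tokens)
        if score > best_score:
            best, best_score = row, score
    return best


# An unstructured, ungrammatical "post" in the sense used above.
post = "98 honda civic 4dr runs great $1500 obo"
match = best_match(post, REFERENCE_SET)
```

Here `match` identifies which entity the post refers to, which is the kind of information-extraction use the abstract mentions for reference sets.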
Matthew Michelson, Craig A. Knoblock
Added: 28 Jan 2011
Updated: 28 Jan 2011
Type: Journal
Year: 2010
Where: JAIR
Authors: Matthew Michelson, Craig A. Knoblock