Sciweavers

AAAI
2006

Phoebus: A System for Extracting and Integrating Data from Unstructured and Ungrammatical Sources

With the proliferation of online classifieds and auctions comes a new need to meaningfully search and organize the items for sale. However, because sellers' item descriptions are unstructured and do not conform to a standard set of values (think "Chevy" versus "Chevrolet"), searching and organizing this data is difficult. This paper describes a working demonstration of the Phoebus system, which uses both record linkage and information extraction to parse out the meaningful attributes of an item description and assign them standard values. The data can then be sorted, searched, and linked to other data sources that require standard attribute values for the link.
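To make the idea concrete, here is a minimal Python sketch of combining record linkage with extraction. It is not the Phoebus implementation: the reference set, the function names (`similarity`, `link_record`, `extract`), the whitespace tokenization, the `difflib`-based string similarity, and the 0.5 threshold are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical reference set of standard attribute values (not from the paper).
REFERENCE_SET = [
    {"make": "Chevrolet", "model": "Impala"},
    {"make": "Honda", "model": "Civic"},
    {"make": "Honda", "model": "Accord"},
]


def similarity(a, b):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_token_score(value, tokens):
    """Highest similarity between a standard value and any token in the post."""
    return max((similarity(value, t) for t in tokens), default=0.0)


def link_record(post):
    """Record linkage step: pick the reference record that best matches the post."""
    tokens = post.split()

    def record_score(record):
        return sum(best_token_score(v, tokens) for v in record.values())

    return max(REFERENCE_SET, key=record_score)


def extract(post, threshold=0.5):
    """Extraction step: keep each attribute of the linked record that some
    token in the post resembles closely enough, using the standard value."""
    record = link_record(post)
    tokens = post.split()
    return {
        attr: value
        for attr, value in record.items()
        if best_token_score(value, tokens) >= threshold
    }
```

With this toy reference set, `extract("93 chevy impala 4dr clean title")` returns `{"make": "Chevrolet", "model": "Impala"}`, so the nonstandard "chevy" is mapped to the standard value "Chevrolet". A real system would need a far larger reference set and a more careful similarity measure.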
Matthew Michelson, Craig A. Knoblock
Added 30 Oct 2010
Updated 30 Oct 2010
Type Conference
Year 2006
Where AAAI
Authors Matthew Michelson, Craig A. Knoblock