Sciweavers

IJSI
2008

Towards Knowledge Acquisition from Semi-Structured Content

Abstract A rich family of generic Information Extraction (IE) techniques has been developed by researchers. This paper proposes WebKER, a system that automatically extracts knowledge from semi-structured content on Web pages using wrappers and domain ontologies. During extraction, wrappers are learned through suffix arrays. Domain ontologies are then used to automatically align the raw data extracted by the wrappers, and knowledge is generated by describing the data with Resource Description Framework (RDF) statements. After a merging process, the newly generated knowledge is added to the Knowledge Base (KB), where users can query it regardless of which resource it was derived from. A prototype of WebKER has been implemented. The paper also presents a performance evaluation of the system and compares querying information in the KB with querying information in a traditional database, indicating the superiority of our system. In addition, the evaluation of the outstanding wra...
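The abstract says wrappers are learned through suffix arrays but gives no detail. As a rough illustration of the general idea only, and not of WebKER's actual algorithm, the sketch below builds a suffix array over a tokenized page and picks out the longest token run that repeats, which is one plausible candidate for a record template. The tokenization (markup tags kept, text collapsed to a `TEXT` placeholder), the sample page, and all function names are assumptions made for this example.

```python
from typing import List, Sequence, Tuple


def build_suffix_array(tokens: Sequence[str]) -> List[int]:
    """Return the start indices of all suffixes of `tokens`, sorted lexicographically."""
    # Naive construction by sorting whole suffixes; fine for a short illustration.
    return sorted(range(len(tokens)), key=lambda i: tuple(tokens[i:]))


def common_prefix_len(a: Sequence[str], b: Sequence[str]) -> int:
    """Number of leading tokens that `a` and `b` share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def longest_repeated_run(tokens: Sequence[str]) -> Tuple[str, ...]:
    """Longest token run occurring at least twice (occurrences may overlap)."""
    sa = build_suffix_array(tokens)
    best: Tuple[str, ...] = ()
    # The longest repeat always shows up as a shared prefix of two
    # neighbouring suffixes in sorted order, so only neighbours are compared.
    for prev, cur in zip(sa, sa[1:]):
        k = common_prefix_len(tokens[prev:], tokens[cur:])
        if k > len(best):
            best = tuple(tokens[cur:cur + k])
    return best


# Hypothetical tokenized page: a list with two structurally identical records.
page_tokens = [
    "<ul>",
    "<li>", "<b>", "TEXT", "</b>", "TEXT", "</li>",
    "<li>", "<b>", "TEXT", "</b>", "TEXT", "</li>",
    "</ul>",
]

pattern = longest_repeated_run(page_tokens)
print(pattern)
```

On this sample the repeated run is the six-token `<li> ... </li>` record. A real wrapper learner would also need to handle optional fields, nested repeats, and noise, none of which this sketch attempts.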
Xi Bai, Jigui Sun, Haiyan Che, Lian Shi
Added 12 Dec 2010
Updated 12 Dec 2010
Type Journal
Year 2008
Where IJSI
Authors Xi Bai, Jigui Sun, Haiyan Che, Lian Shi