Corpus Exploitation from Wikipedia for Ontology Construction



Ontology construction usually requires a domain-specific corpus for building the corresponding concept hierarchy. The domain corpus must have good coverage of domain knowledge. Wikipedia (Wiki), the world's largest online encyclopaedic knowledge source, is open-content, collaboratively edited, and free of charge. It contains millions of articles and continues to expand. These characteristics make Wiki a good candidate as a domain corpus resource in ontology construction. However, the selected article collection must be sufficient in both quality and quantity. In this paper, a novel approach is proposed to identify articles in Wiki as a domain-specific corpus by using the classification information available in Wiki pages. The main idea is to generate a domain hierarchy from the hyperlinked pages of Wiki. Only articles strongly linked to this hierarchy are selected as the domain corpus. The proposed approach makes use of linked category information in Wiki pages to produce the hi...
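
The main idea in the abstract (build a domain hierarchy from category links, then keep only articles strongly linked to it) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract is truncated and does not state how "strongly linked" is measured, so the depth limit `max_depth`, the ratio threshold `min_ratio`, both function names, and the toy category data below are all assumptions made for illustration.

```python
from collections import deque


def build_domain_hierarchy(root, subcategories, max_depth=2):
    """Breadth-first walk over category -> subcategory links, starting at root.

    Returns a dict mapping each reached category to its depth below root.
    max_depth is a hypothetical cut-off, not a value taken from the paper.
    """
    depth = {root: 0}
    queue = deque([root])
    while queue:
        category = queue.popleft()
        if depth[category] >= max_depth:
            continue  # do not expand below the cut-off
        for child in subcategories.get(category, ()):
            if child not in depth:  # also guards against category cycles
                depth[child] = depth[category] + 1
                queue.append(child)
    return depth


def select_domain_articles(article_categories, hierarchy, min_ratio=0.5):
    """Keep articles whose share of categories inside the hierarchy is at
    least min_ratio. This ratio is an assumed stand-in for 'strongly linked'.
    """
    selected = []
    for title, categories in article_categories.items():
        if not categories:
            continue  # an uncategorised article cannot be linked to the domain
        in_domain = sum(1 for c in categories if c in hierarchy)
        if in_domain / len(categories) >= min_ratio:
            selected.append(title)
    return sorted(selected)


# Toy data, invented for illustration only (not real Wikipedia categories).
subcategories = {
    "Education": ["Schools", "Teaching"],
    "Schools": ["Universities"],
    "Universities": ["University presses"],
}
article_categories = {
    "Lecture": ["Teaching", "Schools"],
    "Campus": ["Universities", "Architecture"],
    "Printing press": ["University presses", "Machines", "History"],
    "Stub": [],
}

hierarchy = build_domain_hierarchy("Education", subcategories, max_depth=2)
corpus = select_domain_articles(article_categories, hierarchy, min_ratio=0.5)
print(corpus)
```

In this toy run, "University presses" lies below the assumed depth cut-off and is left out of the hierarchy, so "Printing press" is not selected. A stricter `min_ratio` would trade corpus size for domain purity, which mirrors the quality-versus-quantity concern raised in the abstract.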

Gaoying Cui, Qin Lu, Wenjie Li, Yi-Rong Chen


Domain Corpus | Domain-specific Corpus | Education | LREC 2008 | Ontology Construction


Post Info

Added	29 Oct 2010
Updated	29 Oct 2010
Type	Conference
Year	2008
Where	LREC
Authors	Gaoying Cui, Qin Lu, Wenjie Li, Yi-Rong Chen
