Sciweavers

IMCSIT
2010

Learning taxonomic relations from a set of text documents

This paper presents a methodology for learning taxonomic relations from a set of documents, each of which explains one concept. The study compares three feature extraction approaches with varying degrees of language independence. The first is a language-independent approach based on statistical keyphrase extraction. The second combines rule-based stemming with fuzzy logic-based feature weighting and selection. The third is the traditional tf-idf weighting scheme with commonly used rule-based stemming. The concept hierarchy is obtained by combining Self-Organizing Map clustering with agglomerative hierarchical clustering. Experiments are conducted for both English and Finnish. The results show that concept hierarchies can also be constructed automatically with statistical methods, without heavy language-specific preprocessing.
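The general idea of building a concept hierarchy from one-document-per-concept input can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes plain whitespace tokenization with no stemming, uses a basic tf-idf weighting, and skips the Self-Organizing Map stage entirely, going straight to average-linkage agglomerative clustering on cosine similarity. The function names and the four toy documents are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document using plain tf-idf."""
    tokenized = [doc.lower().split() for doc in docs]  # assumption: no stemming
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({term: (count / len(tokens)) * math.log(n_docs / df[term])
                        for term, count in tf.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def agglomerate(names, vectors):
    """Average-linkage agglomerative clustering; returns a nested tuple tree."""
    # Each cluster is (subtree, list of member document indices).
    clusters = [(name, [i]) for i, name in enumerate(names)]
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                sims = [cosine(vectors[i], vectors[j])
                        for i in clusters[a][1] for j in clusters[b][1]]
                score = sum(sims) / len(sims)
                if best is None or score > best[0]:
                    best = (score, a, b)
        _, a, b = best
        merged = ((clusters[a][0], clusters[b][0]),
                  clusters[a][1] + clusters[b][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return clusters[0][0]


# Hypothetical toy input: one short document per concept.
concept_docs = {
    "cat": "cat is a small feline animal that purrs",
    "lion": "lion is a large feline animal that roars",
    "car": "car is a road vehicle with an engine",
    "truck": "truck is a large road vehicle with an engine",
}
names = list(concept_docs)
tree = agglomerate(names, tfidf_vectors(list(concept_docs.values())))
print(tree)
```

On this toy input the two animal concepts end up in one branch and the two vehicle concepts in the other. Swapping the whitespace tokenizer for a stemmer or a keyphrase extractor changes only the feature extraction step, which is the kind of comparison the abstract describes.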
Added: 13 Feb 2011
Updated: 13 Feb 2011
Type: Journal
Year: 2010
Where: IMCSIT
Authors: Mari-Sanna Paukkeri, Alberto Pérez García-Plaza, Sini Pessala, Timo Honkela