FLAIRS
2007

Contextual Concept Discovery Algorithm

In this paper, we focus on extracting and evaluating ontological concepts from HTML documents. To improve this process, we propose an unsupervised hierarchical clustering algorithm, “Contextual Concept Discovery” (CCD), which applies the k-means partitioning algorithm incrementally and is guided by a structural context. This context exploits the HTML structure and the location of words to select the semantically closest co-occurrents for each word and to improve word weighting. Guided by this context definition, we perform an incremental clustering that refines the context of each word cluster to obtain semantically extracted concepts. The CCD algorithm can run either automatically or interactively with a user. Its last function is to provide complementary support that eases the evaluation task. This functionality is based on a large collection of web documents and several context definitions deduced from it by a...
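The idea of an incremental k-means guided by a structural context can be illustrated with a minimal Python sketch. It is not the authors' implementation: the tag weights in `TAG_WEIGHTS`, the `(tag, words)` block representation, the `max_size` stopping rule, and all function names are assumptions made for illustration only. Words that co-occur inside the same HTML block receive a co-occurrence weight that depends on the enclosing tag, and any cluster larger than `max_size` is split again with k-means.

```python
import random

# Hypothetical weights (an assumption, not from the paper): co-occurrence
# inside a title or heading counts more than inside a paragraph.
TAG_WEIGHTS = {"title": 3.0, "h1": 2.0, "p": 1.0}

def context_vectors(blocks):
    """Build one weighted co-occurrence vector per word from (tag, words) blocks."""
    vocab = sorted({w for _, words in blocks for w in words})
    index = {w: i for i, w in enumerate(vocab)}
    vectors = {w: [0.0] * len(vocab) for w in vocab}
    for tag, words in blocks:
        weight = TAG_WEIGHTS.get(tag, 1.0)
        for w in words:
            for other in words:
                if other != w:
                    vectors[w][index[other]] += weight
    return vectors

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(points) for col in zip(*points)]

def kmeans(items, vectors, k, rng, iters=20):
    """Plain Lloyd-style k-means over the given words; returns non-empty groups."""
    centers = [vectors[w] for w in rng.sample(items, k)]
    for _ in range(iters):
        groups = [[] for _ in centers]
        for w in items:
            nearest = min(range(len(centers)),
                          key=lambda c: sq_dist(vectors[w], centers[c]))
            groups[nearest].append(w)
        # An empty group keeps its previous center.
        centers = [mean([vectors[w] for w in g]) if g else c
                   for g, c in zip(groups, centers)]
    return [g for g in groups if g]

def incremental_clusters(vectors, k=2, max_size=3, seed=0):
    """Re-apply k-means to every cluster that is still larger than max_size."""
    rng = random.Random(seed)
    pending = [sorted(vectors)]
    done = []
    while pending:
        cluster = pending.pop()
        if len(cluster) <= max_size:
            done.append(cluster)
            continue
        parts = kmeans(cluster, vectors, min(k, len(cluster)), rng)
        if len(parts) < 2:
            done.append(cluster)  # k-means could not split it any further
        else:
            pending.extend(parts)
    return done

# Toy input: each block is the tag it came from plus the words it contains.
blocks = [
    ("title", ["hotel", "room"]),
    ("p", ["hotel", "room", "price"]),
    ("h1", ["flight", "airport"]),
    ("p", ["flight", "airport", "ticket"]),
]
clusters = incremental_clusters(context_vectors(blocks))
print(clusters)
```

In this sketch the stopping rule is a fixed cluster size. The abstract also mentions an interactive mode, in which a user could presumably decide instead whether a given cluster should be split further.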
Added 02 Oct 2010
Updated 02 Oct 2010
Type Conference
Year 2007
Where FLAIRS
Authors Lobna Karoui, Marie-Aude Aufaure, Nacéra Bennacer