In this paper, we focus on the extraction and evaluation of ontological concepts from HTML documents. To improve this process, we propose an unsupervised hierarchical clustering algorithm called “Contextual Concept Discovery” (CCD), which applies the K-means partitioning algorithm incrementally and is guided by a structural context. This context exploits the HTML structure and the location of words to select the semantically closest co-occurrents of each word and to improve word weighting. Guided by this context definition, we perform an incremental clustering that refines the context of each word cluster to obtain semantically coherent concepts. The CCD algorithm can be run either automatically or interactively with a user. Finally, the CCD algorithm provides complementary support that eases the evaluation task. This functionality is based on a large collection of web documents and several context definitions deduced from it by a...