This paper explores the possibility of exploiting text on the World Wide Web in order to enrich the concepts in existing ontologies. First, a method to retrieve documents from the WWW...
Eneko Agirre, Olatz Ansa, Eduard H. Hovy, David Ma...
A major challenge in document clustering is the extremely high dimensionality. For example, the vocabulary for a document set can easily be thousands of words. On the other hand, ...
Clustering is a branch of multivariate analysis that is used to create groups of data. While there are currently a variety of techniques that are used for creating clusters, many ...
Javier Bajo, Juan Francisco de Paz, Sara Rodrí...
DIVCLUS-T is a divisive hierarchical clustering algorithm based on a monothetic bipartitional approach allowing the dendrogram of the hierarchy to be read as a decision tree. It i...
We built a system for the automatic creation of a text-based topic hierarchy, meant to be used in a geographically defined community. This poses two main problems. First, the appea...