It is shown that structural similarity between proteins can be decided well with much less information than what is used in common similarity measures. The full Cα representation c...
The application of document clustering to information retrieval has been motivated by the potential effectiveness gains postulated by the Cluster Hypothesis. The hypothesis states ...
An important problem in principal component analysis (PCA) is the estimation of the correct number of components to retain. PCA is most often used to reduce a set of observed vari...
Current Web search tools do a good job of retrieving documents that satisfy the wide range of intentions that people associate with a query – but do not do a very good job of di...
This paper presents the results of using Roget's International Thesaurus as the taxonomy in a semantic similarity measurement task. Four similarity metrics were taken from th...