Domain specific ontologies are heavily used in many applications. For instance, these form the bases on which similarity/dissimilarity between keywords are extracted for various knowledge discovery and retrieval tasks. Existing similarity computation schemes can be categorized as (a) structure- or (b) information-based approaches. Structurebased approaches compute dissimilarity between keywords using a (weighted) count of edges between two keywords. Information-base approaches, on the other hand, leverage available corpora to extract additional information, such as keyword frequency, to achieve better performance in similarity computation than structure-based approaches. Unfortunately, in many application domains (such as applications that rely on unique-keys in a relational database), frequency information required by information-based approaches does not exist. In this paper, we note that there is a third way of computing similarity: if each node in a given hierarchy can be represen...
Jong Wook Kim, K. Selçuk Candan