We present an EM-based clustering method that can be used for constructing or augmenting ontologies such as MeSH. Our algorithm simultaneously clusters verbs and nouns using both verb-noun and noun-noun co-occurrence pairs. This strategy provides greater coverage of words than using either set of pairs alone, since not all words appear in both datasets. We demonstrate it on data extracted from Medline and evaluate the results using MeSH and Wordnet.
Vasileios Kandylas, Lyle H. Ungar, Ted Sandler, Sh