Ontology construction requires an understanding of the meaning and usage of its encoded concepts. While definitions found in dictionaries or glossaries may be adequate for many concepts, the actual usage in expert writing could be a better source of information for many others. The goal of this paper is to describe an automated procedure for finding definitional content in expert writing. The approach uses machine learning on phrasal features to learn when sentences in a book contain definitional content, as determined by their similarity to glossary definitions provided in the same book. The end result is not a concise definition of a given concept, but for each sentence, a predicted probability that it contains information relevant to a definition. The approach is evaluated automatically for terms with explicit definitions, and manually for terms with no available definition.
Lawrence H. Smith, W. John Wilbur