Supervised approaches to Word Sense Disambiguation (WSD) have been shown to outperform other approaches but are hampered by reliance on labeled training examples (the data acquisition bottleneck). This paper presents a novel approach to the automatic acquisition of labeled examples for WSD which makes use of the Information Retrieval technique of relevance feedback. This semi-supervised method generates additional labeled examples based on existing annotated data. Our approach is applied to a set of ambiguous terms from biomedical journal articles and found to significantly improve the performance of a state-of-the-art WSD system.
Mark Stevenson, Yikun Guo, Robert J. Gaizauskas