Unsupervised Semantic Similarity Computation between Terms Using Web Documents



Abstract— In this work, web-based metrics for semantic similarity computation between words or terms are presented and compared with the state of the art. Starting from the fundamental assumption that similarity of context implies similarity of meaning, context-based metrics use a web search engine to download relevant documents and then exploit the retrieved contextual information for the words of interest. The proposed algorithms work automatically, can be generalized and applied to other languages, and do not require any human-annotated knowledge resources, e.g., ontologies. Context-based metrics are evaluated on the Charles-Miller dataset and on a medical term dataset. It is shown that the context-based similarity metrics significantly outperform co-occurrence-based metrics, in terms of correlation with human judgment, for both tasks. In addition, the proposed context-based algorithms are shown to be competitive with state-of-the-art supervised semantic similarity metrics that emp...
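To make the idea of "similarity of context implies similarity of meaning" concrete, the following is a minimal sketch of one possible context-based metric. It is not the authors' algorithm: the abstract does not specify the exact metric, so cosine similarity over bag-of-words context windows is an assumption made here for illustration. The names `context_vector`, `cosine`, `context_similarity`, the `window` parameter, and the `TOY_DOCS` list are all hypothetical, and the in-memory list stands in for documents that a web search engine would return. Only single-word terms are handled.

```python
import math
import re
from collections import Counter

# Toy stand-in for documents retrieved by a web search engine (assumption).
TOY_DOCS = [
    "the car drives on the road",
    "the automobile drives on the road",
    "the banana grows on the tree",
]


def context_vector(term, documents, window=2):
    """Count the words found within `window` tokens of each occurrence of `term`."""
    counts = Counter()
    for doc in documents:
        tokens = re.findall(r"[a-z]+", doc.lower())
        for i, tok in enumerate(tokens):
            if tok == term:
                left = tokens[max(0, i - window):i]
                right = tokens[i + 1:i + 1 + window]
                counts.update(left + right)
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def context_similarity(term1, term2, documents, window=2):
    """Score two terms by how similar their surrounding words are."""
    return cosine(
        context_vector(term1, documents, window),
        context_vector(term2, documents, window),
    )
```

On the toy documents, `context_similarity("car", "automobile", TOY_DOCS)` scores higher than `context_similarity("car", "banana", TOY_DOCS)`, because "car" and "automobile" share all of their neighbouring words while "car" and "banana" share only some. A real system would need far more text per term for such scores to be meaningful.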

Elias Iosif, Alexandros Potamianos
