Similarity measures assign a numeric score indicating how closely two documents, or a document and a query, match. The Cosine measure is one such measure; it treats a document or a query as a vector of weighted terms or keywords. The similarity score it computes is based on exact matching of keywords, so the semantic relatedness between the keywords of the two documents is not considered. This paper presents a Category-based Similarity Algorithm (CSA) that determines the semantic similarity between any two pieces of information. CSA is implemented inside ACORN (Agent-based Community Oriented Routing Network), a multi-agent system for information retrieval and provision in a community of users, and adds semantic similarity to it. CSA can also be used in any information-sharing system in which information content is represented as vectors of weighted keywords.
Sepideh Miralaei, Ali A. Ghorbani
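
The exact-matching limitation of the Cosine measure described in the abstract can be illustrated with a minimal sketch. This is the standard cosine formula applied to weighted keyword vectors, not the CSA algorithm itself; the function name `cosine_similarity`, the dictionary representation, and the sample keywords and weights are assumptions made for illustration only.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two weighted keyword vectors.

    Each vector is a dict mapping keyword -> weight (a hypothetical
    representation chosen for this sketch). Only keywords that appear
    verbatim in both vectors contribute to the dot product.
    """
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector matches nothing
    return dot / (norm_a * norm_b)


# Illustrative data, not taken from the paper.
document = {"car": 0.8, "engine": 0.6}
query_exact = {"car": 1.0}
query_synonym = {"automobile": 1.0}

print(cosine_similarity(document, query_exact))    # positive: "car" matches
print(cosine_similarity(document, query_synonym))  # 0.0: no shared keyword
```

In this sketch the query containing "automobile" scores zero against a document about "car", even though the two keywords are closely related. This is the gap that a semantic, category-based measure is intended to close.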