The retrieval of similar documents in the Web from a given document is different in many aspects from information retrieval based on queries generated by regular search engine use...
Felipe Bravo-Marquez, Gaston L'Huillier, Sebasti&a...
Abstract— In this work, web-based metrics for semantic similarity computation between words or terms are presented and compared with the state-of-the-art. Starting from the funda...
Conventional document search techniques are constrained by attempting to match individual keywords or phrases to source documents. Thus, these techniques miss out documents that co...
In this paper, we propose a new similarity measure to compute the pairwise similarity of text-based documents based on suffix tree document model. By applying the new suffix tree ...
As the amount of textual information available through the World Wide Web grows, there is a growing need for high-precision IR systems that enable a user to nd useful information ...
Mandar Mitra, Chris Buckley, Amit Singhal, Claire ...