Phrase has been considered as a more informative feature term for improving the effectiveness of document clustering. In this paper, we propose a phrase-based document similarity t...
Document clustering techniques mostly rely on single term analysis of the document data set, such as the Vector Space Model. To better capture the structure of documents, the unde...
— One of the critical issues in search engines is the size of search indexes: as the number of documents handled by an engine increases, the search must preserve its efficiency,...
It is well known that the main objective of conceptual retrieval models is to go beyond simple term matching by relaxing term independence assumption through concept recognition. ...
Structured data including sets, sequences, trees and graphs, pose significant challenges to fundamental aspects of data management such as efficient storage, indexing, and simila...
Xiaohong Wang, Aaron M. Smalter, Jun Huan, Gerald ...