E-commerce and knowledge management applications generate and consume tremendous amounts of online information that is typically available as textual documents. To facilitate subs...
Document clustering has long been an important problem in information retrieval. In this paper, we present a new clustering algorithm, ASI1, which uses explicit modeling of the s...
Text clustering typically involves clustering in a high-dimensional space, which is difficult in virtually all practical settings. In addition, given a particular...
In this paper, we propose combining the Self-Organizing Map (SOM) with the tangent distance for effective clustering in Document Image Analysis. The proposed model (SOM...
This paper presents a new approach designed to reduce the computational load of existing clustering algorithms by trimming down document size using fingerprinting methods...