Many approaches to unsupervised morphology acquisition incorporate the frequency of character sequences with respect to each other to identify word stems and affixes. This typical...
We present a novel approach for multilingual document clustering using only comparable corpora to achieve cross-lingual semantic interoperability. The method models document colle...
This paper presents a system that combines two text mining techniques; information extraction and clustering. A rulebased approach is used to perform the information extraction tas...
A good clustering performance depends on the quality of the distance function used to asses similarity. In this paper we propose a pairwise document coreference model to improve pe...
Iustin Dornescu, Constantin Orasan, Tatiana Lesnik...
Hierarchical categorization of documents is a task receiving growing interest due to the widespread proliferation of topic hierarchies for text documents. The worst problem of hie...