Although clustering under constraints is a current research topic, a hierarchical setting, in which a hierarchy of clusters is the goal, is usually not considered. This paper trie...
Text clustering is most commonly treated as a fully automated task without user supervision. However, we can improve clustering performance using supervision in the form of pairwi...
— Often document dissemination is limited to a “need to know” basis so as to better maintain organizational trade secrets. Retrieving documents that are off-topic to a user...
Clustering by document concepts is a powerful way of retrieving information from a large number of documents. This task in general does not make any assumption on the data distrib...
: BigBatch is an image processing environment designed to process batches of thousands of monochromatic documents. One of the flexibilities and pioneer aspects of BigBatch is offer...