— Clustering is grouping of patterns according to similarity or distance in different perspectives. Various data representations, similarity measurements and organization manners...
This paper presents a language identification technique that detects Latin-based languages of imaged documents without OCR. The proposed technique detects languages through the wo...
We present a novel passage-based approach to re-ranking documents in an initially retrieved list so as to improve precision at top ranks. While most work on passage-based document...
This paper describes features and methods for document image comparison and classification at the spatial layout level. The methods are useful for visual similarity based document...
Jianying Hu, Ramanujan S. Kashi, Gordon T. Wilfong
The goal of the paper is to present a novel Chi-square similarity measure and assess its performance through comparison with well-known similarity measures such as Cosine, Dice, a...
Oktay Ibrahimov, Ishwar K. Sethi, Nevenka Dimitrov...