Stop word detection is attempted in this work in the context of retrieval of document images in the compressed domain. Algorithms are presented to identify text lines and words an...
In a federated digital library system, it is too expensive to query every accessible library. Resource selection is the task to decide to which libraries a query should be routed. ...
ABSTRACT: OCR is an error-prone process. It is time-consuming and expensive to manually proofread OCR results. The errors remaining in OCRed texts can cause serious problems in rea...
The detection of image versions from large image collections is a formidable task as two images are rarely identical. Geometric variations such as cropping, rotation, and slight p...
We propose a novel technique for semi-supervised image annotation which introduces a harmonic regularizer based on the graph Laplacian of the data into the probabilistic semantic ...
Yuanlong Shao, Yuan Zhou, Xiaofei He, Deng Cai, Hu...