In this paper, we investigate how to automatically determine whether two document collections are written from different perspectives. By perspective we mean a point of view, for examp...
We describe iNeATS – an interactive multi-document summarization system that integrates a state-of-the-art summarization engine with an advanced user interface. Three main goals...
In this article, we propose a segmentation-driven recognition system which aims at extracting numerical fields from handwritten documents. We show that a crucial point of the syst...
In this paper, we propose a machine learning approach to title extraction from general documents. By general documents, we mean documents that can belong to any one of a number of...
Yunhua Hu, Hang Li, Yunbo Cao, Dmitriy Meyerzon, Q...
Document registration is the problem of registering the image of a template document, whose layout is known, with a test document image. Given the registration parameters, layout...