Large-scale digitization projects aimed at periodicals often take streams of completely unlabeled document images as input. In such situations, the results produced by the automat...
Iuliu Vasile Konya, Christoph Seibert, Sebastian G...
Abstract. Topic models are a discrete analogue to principal component analysis and independent component analysis that model topic at the word level within a document. They have ma...
This paper presents a novel approach for skew correction of documents. Skew correction is modelled as an optimization problem, and for the first time, Particle Swarm Optimization...
Abstract. We aim to extend basic digital camera functionality with the ability to simulate the flattening of a document, virtually acting like a flatbed scanner....
Although detecting text lines in machine printed documents is typically considered a solved problem, it is still a challenge to segment handwritten text lines in the general sense...