Projection methods have been used for over two decades in the analysis of bi-tonal document images for tasks such as page segmentation and skew correction. However, these algorithms are sensitive to border noise, which can appear along the page border as a result of scanning or photocopying. Over the years, several page segmentation algorithms have been proposed in the literature, and some of them have come into widespread use due to their high accuracy and robustness with respect to border noise. This paper addresses two important questions in this context: 1) Can existing border noise removal algorithms clean up document images to the degree required by projection methods to achieve competitive performance? 2) Can projection methods reach the performance of other state-of-the-art page segmentation algorithms (e.g. Docstrum or Voronoi) on documents from which border noise has been successfully removed? We perform extensive experiments on t...
Faisal Shafait, Thomas M. Breuel