In this paper we present a novel method for removing page rule lines in monochromatic handwritten Arabic documents using subspace methods with minimal effect on the quality of the foreground text. We use moment and histogram properties to extract features that represent the characteristics of the underlying rule lines. A linear subspace is incrementally built to obtain a line model that can be used to identify rule line pixels. We also introduce a novel scheme for evaluating noise removal algorithms in general and we use it to assess the quality of our rule line removal algorithm. Experimental results presented on a data set of 50 Arabic documents, handwritten by different writers, demonstrate the effectiveness of the proposed method.
Wael Abd-Almageed, Jayant Kumar, David S. Doermann