We describe a new approach for evaluating page segmentation algorithms. Unlike techniques that rely on OCR output, our method is region-based: the segmentation output, described a...
This paper presents a dynamic approach to document page segmentation. Current page segmentation algorithms lack the ability to dynamically adapt local variations in the size, orie...
This work aims to provide a page segmentation algorithm which uses both visual and content information to extract the semantic structure of a web page. The visual information is u...
Most page segmentation algorithms have userspecifiable free parameters. However, algorithm designers typically do not provide a quantitative/rigorous method for choosing values fo...
This paper presents a new representation and evaluation procedure of page segmentation algorithms and analyzes six widely-used layout analysis algorithms using the procedure. The ...