ICIP
2003
IEEE

Stochastic attributed K-d tree modeling of technical paper title pages

Structural information about a document is essential for structured query processing, indexing, and retrieval. A document page can be partitioned into a hierarchy of homogeneous regions such as columns, paragraphs, etc.; these regions are called physical components, and define the physical layout of the page. In this paper we develop a class of models for the physical layouts of technical paper title pages. We model physical layout using hidden semi-Markov models for directional projections of page regions, and a stochastic attributed K-d tree grammar model for the 2D hierarchical structure of these regions. We use the models to generate sets of synthetic title page images of three distinctive styles, which we use in controlled experiments on page structure analysis.
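The idea of a page as a hierarchy of rectangular regions produced by recursive axis-aligned cuts can be illustrated with a small sketch. This is not the authors' grammar or their hidden semi-Markov projection models. It is a minimal toy under stated assumptions: the names `Region`, `split`, and `leaves`, the uniform cut distribution, the stopping probability `p_stop`, and the minimum region size `min_size` are all hypothetical choices made for illustration.

```python
import random
from dataclasses import dataclass, field
from typing import List


@dataclass
class Region:
    """Axis-aligned page region in pixel coordinates (hypothetical type)."""
    x0: int
    y0: int
    x1: int
    y1: int
    children: List["Region"] = field(default_factory=list)


def split(region: Region, depth: int, rng: random.Random,
          min_size: int = 40, p_stop: float = 0.3) -> Region:
    """Recursively cut a region, alternating cut direction K-d tree style.

    Assumption: cut positions are uniform and stopping is a coin flip;
    the paper's actual stochastic attributed grammar is not reproduced here.
    """
    horizontal = depth % 2 == 0  # even depth: cut into top/bottom parts
    start = region.y0 if horizontal else region.x0
    end = region.y1 if horizontal else region.x1
    too_small = (end - start) < 2 * min_size
    if too_small or (depth > 0 and rng.random() < p_stop):
        return region  # leaf: a homogeneous physical component
    cut = rng.randint(start + min_size, end - min_size)
    if horizontal:
        parts = [Region(region.x0, region.y0, region.x1, cut),
                 Region(region.x0, cut, region.x1, region.y1)]
    else:
        parts = [Region(region.x0, region.y0, cut, region.y1),
                 Region(cut, region.y0, region.x1, region.y1)]
    region.children = [split(p, depth + 1, rng, min_size, p_stop)
                       for p in parts]
    return region


def leaves(region: Region) -> List[Region]:
    """Collect the leaf regions, i.e. the final physical layout blocks."""
    if not region.children:
        return [region]
    return [leaf for child in region.children for leaf in leaves(child)]


if __name__ == "__main__":
    # Sample one synthetic layout for an 850 x 1100 pixel page.
    page = split(Region(0, 0, 850, 1100), 0, random.Random(0))
    for r in leaves(page):
        print(r.x0, r.y0, r.x1, r.y1)
```

Sampling with different seeds yields different layouts from the same generative rules, which is the sense in which such a model can produce sets of synthetic pages for controlled experiments.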
Song Mao, Azriel Rosenfeld, Tapas Kanungo
Added 24 Oct 2009
Updated 27 Oct 2009
Type Conference
Year 2003
Where ICIP