Abstract. This paper describes the challenges facing the document image analysis community in building large digital libraries with diverse document categories. The challenges are identified from the experience of the ongoing activities toward digitizing and archiving one million books. A smooth workflow has been established for archiving large quantities of books, with the help of efficient image processing algorithms. However, much more research is needed to address the challenges arising from the diversity of content in digital libraries.
K. Pramod Sankar, Vamshi Ambati, Lakshmi Pratha, C