Large-scale digitization projects have been conducted at the Internet Archive digital library to preserve cultural artifacts and to provide permanent access to them. The growing volume of digitized resources requires advanced tools and methods to analyze and manage those resources efficiently. In this position paper, we identify several issues related to scanned-book projects, present our initial work on automatic metadata generation for scanned scientific journals, and suggest potential future actions.

Categories and Subject Descriptors
H.3.7 [Information Storage and Retrieval]: Digital Libraries