Massive amounts of useful data are stored and processed in ad hoc formats for which common tools like parsers, printers, query engines and format converters are not readily available. In this paper, we explain the design and implementation of PADS/ML, a new language and system that facilitates the generation of data processing tools for ad hoc formats. The PADS/ML design includes features such as dependent, polymorphic and recursive datatypes, which allow programmers to describe the syntax and semantics of ad hoc data in a concise, easy-to-read notation. The PADS/ML implementation compiles these descriptions into ML structures and functors that include types for parsed data, functions for parsing and printing, and auxiliary support for user-specified, format-dependent and format-independent tool generation.

Categories and Subject Descriptors: D.3.2 [Language Classifications]: Applicative (functional) languages

General Terms: Languages

Keywords: Data description languages, domain-specific...