This paper presents the results of an experiment using machine-readable dictionaries (MRDs) and corpora for building concatenative units for text-to-speech (TTS) systems. Theoretical questions concerning the nature of phonemic data in dictionaries are raised; phonemic dictionary data are viewed as a representative corpus from which to extract n-gram phonemic frequencies in the language. Dictionary data are compared to corpus data, and phoneme inventories are evaluated for coverage. A methodology is defined to compute phonemic n-grams for incorporation into a TTS system.
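
As a rough illustration of what computing phonemic n-gram frequencies and evaluating coverage might look like, the following Python sketch counts n-grams over phoneme sequences and measures how much of a corpus inventory is covered by a dictionary inventory. It is not the methodology defined in the paper: the function names (`phoneme_ngrams`, `coverage`), the toy lexicon, the toy corpus, and the phoneme symbols are all hypothetical, and word-boundary handling is deliberately ignored.

```python
from collections import Counter


def phoneme_ngrams(pronunciations, n):
    """Count phoneme n-grams within each pronunciation (no cross-word n-grams)."""
    counts = Counter()
    for phones in pronunciations:
        for i in range(len(phones) - n + 1):
            counts[tuple(phones[i:i + n])] += 1
    return counts


def coverage(dictionary_counts, corpus_counts):
    """Fraction of corpus n-gram tokens whose type also occurs in the dictionary."""
    total = sum(corpus_counts.values())
    if total == 0:
        return 0.0
    covered = sum(c for gram, c in corpus_counts.items() if gram in dictionary_counts)
    return covered / total


# Hypothetical toy data: one pronunciation per dictionary entry.
toy_lexicon = {
    "cat": ["k", "ae", "t"],
    "cab": ["k", "ae", "b"],
    "tack": ["t", "ae", "k"],
}
# Hypothetical transcribed corpus: one phoneme sequence per running word.
toy_corpus = [["k", "ae", "t"], ["b", "ae", "t"]]

dict_bigrams = phoneme_ngrams(toy_lexicon.values(), 2)
corpus_bigrams = phoneme_ngrams(toy_corpus, 2)
bigram_coverage = coverage(dict_bigrams, corpus_bigrams)
```

In this sketch, dictionary counts weight every entry equally, whereas corpus counts reflect how often words actually occur in running text; that difference is one reason the two sources can yield different n-gram frequencies.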