The Lexical Access Problem consists of determining the intended sequence of words corresponding to an input sequence of phonemes (basic speech sounds) that come from a low-level phoneme recognizer. In this paper we present an information-theoretic approach based on the Minimum Message Length Criterion for solving the Lexical Access Problem. We model sentences using phoneme realizations seen in training, and word and part-of-speech information obtained from text corpora. We show results on multiple-speaker, continuous, read speech and discuss a heuristic using equivalence classes of similar sounding words which speeds up the recognition process without significant deterioration in recognition accuracy.
Ian E. Thomas, Ingrid Zukerman, Jonathan J. Oliver