In spontaneous speech, speakers segment their speech into intonational phrases and make repairs to what they are saying. However, techniques for understanding spontaneous speech tend to treat these events as noise, in the same manner as they handle out-of-grammar constructions and misrecognitions. We advocate that these events should instead be explicitly modeled. We modify the speech recognition process so that it not only determines the words that the user is saying, but also models intonational phrasing and speech repairs. This not only improves speech recognition performance but also results in a much richer output from the recognizer, with speech repairs resolved and intonational phrase boundaries identified.
Peter A. Heeman