Speech Dasher allows writing using a combination of speech and a zooming interface. Users first speak what they want to write and then they navigate through the space of recognition hypotheses to correct any errors. Speech Dasher’s model combines information from a speech recognizer, from the user, and from a letter-based language model. This allows fast writing of anything predicted by the recognizer while also providing seamless fallback to letter-by-letter spelling for words not in the recognizer’s predictions. In a formative user study, expert users wrote at 40 (corrected) words per minute. They did this despite a recognition word error rate of 22%. Furthermore, they did this using only speech and the direction of their gaze (obtained via an eye tracker). Author Keywords speech recognition, eye tracking, multimodal interfaces ACM Classification Keywords H.5.2 User Interfaces: Voice I/O General Terms Experimentation, Human Factors, Performance
Keith Vertanen, David J. C. MacKay