This paper examines SParseval, a parsing-based alternative to word error rate (WER), as an objective for optimizing automatic speech recognition (ASR), hypothesizing that it may be a better objective for applications such as translation. We find that SParseval correlates more strongly than WER with human measures of subsequent translation performance, but that optimizing explicitly for SParseval does not yield a significant reduction in translation error as measured by automatic methods based on a single translation reference. However, anecdotal examples indicate that SParseval does improve ASR results, leaving open the possibility that it may prove more useful in the future or for other language processing tasks.
Dustin Hillard, Mei-Yuh Hwang, Mary P. Harper, Mar