Translating between dissimilar languages requires an account of the use of divergent word orders when expressing the same semantic content. Reordering poses a serious problem for s...
BLEU is the de facto standard for evaluation and development of statistical machine translation systems. We describe three real-world situations involving comparisons between diff...
David Chiang, Steve DeNeefe, Yee Seng Chan, Hwee T...
Software metrics are an essential means to assess software quality. For the assessment of software quality, typically sets of complementing metrics are used since individual metric...
Edith Werner, Jens Grabowski, Helmut Neukirchen, N...
Recent work in the field of machine translation (MT) evaluation suggests that sentence-level evaluation based on machine learning (ML) can outperform the standard metrics such as B...
Antoine Veillard, Elvina Melissa, Cassandra Theodo...
We report the results of an experiment to assess the ability of automated MT evaluation metrics to remain sensitive to variations in MT quality as the average quality of the compa...