Sciweavers


INLG 2010, Springer


Comparing Rating Scales and Preference Judgements in Language Evaluation

15 years 3 months ago

Rating-scale evaluations are common in NLP, but are problematic for a range of reasons, e.g. they can be unintuitive for evaluators, inter-evaluator agreement and self-consistency...

Anja Belz, Eric Kow

