Despite the rising interest in developing grammatical error detection systems for non-native speakers of English, progress in the field has been hampered by a lack of informative...
Nitin Madnani, Martin Chodorow, Joel R. Tetreault,...
In crowdsourced relevance judging, each crowd worker typically judges only a small number of examples, yielding a sparse and imbalanced set of judgments in which relatively few wo...
Crowdsourcing platforms offer unprecedented opportunities for creating evaluation benchmarks, but suffer from varied output quality from crowd workers who possess different levels...
Understanding perception is critical to effective visualization design. With its low cost and scalability, crowdsourcing presents an attractive option for evaluating the large des...
MOS (mean opinion score) subjective quality studies are used to evaluate many signal processing methods. Since laboratory quality studies are time consuming and expensive, researc...