We present PEM, the first fully automatic metric to evaluate the quality of paraphrases, and consequently, that of paraphrase generation systems. Our metric is based on three crit...
When comparing clustering results, any evaluation metric reduces the available information to a single number. However, many evaluation metrics exist that are not ...
Elke Achtert, Sascha Goldhofer, Hans-Peter Kriegel...
Many machine translation (MT) evaluation metrics have been shown to correlate better with human judgment than BLEU. In principle, tuning on these metrics should yield better syste...
Commonly used coreference resolution evaluation metrics can only be applied to key mentions, i.e., already annotated mentions. Here we propose two variants of the B3 and CEAF coref...
It is common for large and complex organizations to maintain repositories of business process models in order to document and to continuously improve their operations. Given suc...
Remco M. Dijkman, Marlon Dumas, Boudewijn F. van D...