This paper introduces the novel task of topic coherence evaluation, whereby a set of words, as generated by a topic model, is rated for coherence or interpretability. We apply a range of topic scoring models to the evaluation task, drawing on WordNet, Wikipedia and the Google search engine, and on existing research on lexical similarity and relatedness. In comparison with human scores for a set of learned topics over two distinct datasets, we show that a simple co-occurrence measure based on pointwise mutual information over Wikipedia data achieves results at or near the level of inter-annotator correlation, and that other Wikipedia-based lexical relatedness methods also achieve strong results. Google produces strong, if less consistent, results, while our results over WordNet are patchy at best.
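
For reference, a minimal sketch of the pointwise mutual information measure referred to above, assuming (as in standard formulations) that a topic's score is the average PMI over pairs of its top-$N$ words, with probabilities estimated from word co-occurrence counts in Wikipedia; the choice of $N$ and the co-occurrence estimation window are assumptions not specified in this abstract:

\[
\mathrm{PMI}(w_i, w_j) = \log \frac{p(w_i, w_j)}{p(w_i)\,p(w_j)},
\qquad
\mathrm{Coherence}(w_1, \dots, w_N) = \frac{2}{N(N-1)} \sum_{i < j} \mathrm{PMI}(w_i, w_j)
\]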