Polylingual Topic Models

Topic models are a useful tool for analyzing large text collections, but have previously been applied in only monolingual, or at most bilingual, contexts. Meanwhile, massive collections of interlinked documents in dozens of languages, such as Wikipedia, are now widely available, calling for tools that can characterize content in many languages. We introduce a polylingual topic model that discovers topics aligned across multiple languages. We explore the model's characteristics using two large corpora, each with over ten different languages, and demonstrate its usefulness in supporting machine translation and tracking topic trends across languages.

David M. Mimno, Hanna M. Wallach, Jason Naradowsky, David A. Smith, Andrew McCallum

EMNLP 2009 | Large Text Collections | Natural Language Processing | Polylingual Topic Model | Topic Model |

» Modeling Chinese Documents with Topical Word-Character Models

» A Two-Dimensional Topic-Aspect Model for Discovering Multi-Faceted Topics

» Language Modeling Using PLSA-Based Topic HMM

» Analyzing Entities and Topics in News Articles Using Statistical Topic Models

» PCFGs, Topic Models, Adaptor Grammars and Learning Topical Collocations and the Structure of...

» Structural Topic Model for Latent Topical Structure Analysis

» Context Management with Topics for Spoken Dialogue Systems

» Best Topic Word Selection for Topic Labelling

Post Info

Added	17 Feb 2011
Updated	17 Feb 2011
Type	Conference
Year	2009
Where	EMNLP
Authors	David M. Mimno, Hanna M. Wallach, Jason Naradowsky, David A. Smith, Andrew McCallum
