EMNLP
2010

Holistic Sentiment Analysis Across Languages: Multilingual Supervised Latent Dirichlet Allocation

In this paper, we develop multilingual supervised latent Dirichlet allocation (MLSLDA), a probabilistic generative model that allows insights gleaned from one language's data to inform how the model captures properties of other languages. MLSLDA accomplishes this by jointly modeling two aspects of text: how multilingual concepts are clustered into thematically coherent topics and how topics associated with text connect to an observed regression variable (such as ratings on a sentiment scale). Concepts are represented in a general hierarchical framework that is flexible enough to express semantic ontologies, dictionaries, clustering constraints, and, as a special, degenerate case, conventional topic models. Both the topics and the regression are discovered via posterior inference from corpora. We show MLSLDA can build topics that are consistent across languages, discover sensible bilingual lexical correspondences, and leverage multilingual corpora to better predict sentiment. Sent...
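The link between topics and the regression variable can be illustrated with a minimal sketch. It assumes, as in standard supervised LDA, that the mean of the response is a linear function of a document's empirical topic frequencies; the abstract does not give the exact form used in MLSLDA. The names `topic_proportions`, `predict_response`, and `eta`, and all numeric values, are hypothetical and chosen only for illustration.

```python
from collections import Counter


def topic_proportions(assignments, num_topics):
    """Empirical topic frequencies for one document.

    `assignments` holds one topic index per token. The indices refer to
    topics shared across languages (an assumption of this sketch).
    """
    counts = Counter(assignments)
    total = len(assignments)
    return [counts.get(k, 0) / total for k in range(num_topics)]


def predict_response(assignments, eta):
    """Predicted response: dot product of the regression weights `eta`
    with the document's empirical topic frequencies."""
    zbar = topic_proportions(assignments, len(eta))
    return sum(weight * freq for weight, freq in zip(eta, zbar))


# Hypothetical regression weights, one per shared topic.
eta = [2.0, -1.0, 0.5]

# Hypothetical per-token topic assignments for two documents in
# different languages; both index into the same set of topics.
english_doc = [0, 0, 2, 0]
german_doc = [1, 1, 2, 1]

english_score = predict_response(english_doc, eta)
german_score = predict_response(german_doc, eta)
```

Because both documents are scored with the same weights over the same topics, a rating learned from one language's documents carries over to the other, which is the intuition behind the multilingual prediction claim in the abstract.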
Jordan L. Boyd-Graber, Philip Resnik
Added 11 Feb 2011
Updated 11 Feb 2011
Type Journal
Year 2010
Where EMNLP