Model-based Word Embeddings from Decompositions of Count Matrices

This work develops a new statistical understanding of word embeddings induced from transformed count data. Using the class of hidden Markov models (HMMs) underlying Brown clustering as a generative model, we demonstrate how canonical correlation analysis (CCA) and certain count transformations permit efficient and effective recovery of model parameters with lexical semantics. We further show in experiments that these techniques empirically outperform existing spectral methods on word similarity and analogy tasks, and are also competitive with other popular methods such as WORD2VEC and GLOVE.
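To make the idea of "count transformations" followed by a CCA-style decomposition more concrete, here is a minimal sketch of the preprocessing step only. It is not the authors' implementation. The function names, the window size, and the toy sentence are hypothetical. The sketch assumes a power transform (a square root by default) applied element-wise to the co-occurrence counts and to both marginal counts, followed by the standard CCA scaling count(w, c) / sqrt(count(w) * count(c)); whether this matches the paper's exact recipe should be checked against the paper itself.

```python
from collections import Counter
from math import sqrt


def cooccurrence_counts(tokens, window=1):
    """Count (word, context) pairs within a symmetric window (illustrative helper)."""
    pairs = Counter()
    for i, word in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs[(word, tokens[j])] += 1
    return pairs


def scaled_matrix(pairs, power=0.5):
    """Power-transform the counts, then apply CCA-style scaling.

    Assumption: the joint counts and both marginals are each raised to
    `power` before scaling. With power=1.0 this reduces to the plain
    CCA scaling count(w, c) / sqrt(count(w) * count(c)).
    """
    word_counts = Counter()
    ctx_counts = Counter()
    for (w, c), n in pairs.items():
        word_counts[w] += n
        ctx_counts[c] += n

    scaled = {}
    for (w, c), n in pairs.items():
        numerator = n ** power
        denominator = sqrt((word_counts[w] ** power) * (ctx_counts[c] ** power))
        scaled[(w, c)] = numerator / denominator
    return scaled


tokens = "the dog saw the cat".split()
pairs = cooccurrence_counts(tokens)
omega = scaled_matrix(pairs)
```

Under these assumptions, the word embeddings would then come from a truncated singular value decomposition of the sparse matrix stored in `omega`, keeping the top left singular vectors as word representations.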

Karl Stratos, Michael Collins, Daniel Hsu

ACL 2015 | Computational Linguistics

Post Info

Added	13 Apr 2016
Updated	13 Apr 2016
Type	Journal
Year	2015
Where	ACL
Authors	Karl Stratos, Michael Collins, Daniel Hsu
