Model-based Word Embeddings from Decompositions of Count Matrices

This work develops a new statistical understanding of word embeddings induced from transformed count data. Using the class of hidden Markov models (HMMs) underlying Brown clustering as a generative model, we demonstrate how canonical correlation analysis (CCA) and certain count transformations permit efficient and effective recovery of model parameters with lexical semantics. We further show in experiments that these techniques empirically outperform existing spectral methods on word similarity and analogy tasks, and are also competitive with other popular methods such as WORD2VEC and GLOVE.
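As a rough illustration of the pipeline the abstract describes (transform the co-occurrence counts, then apply CCA), the sketch below relies on the standard fact that CCA between one-hot word and context indicators reduces to an SVD of the count matrix scaled by the square roots of its row and column marginals. The function name `cca_embeddings`, the choice of a square-root transform, the use of marginals of the transformed matrix, and the final row normalization are all assumptions made for this example, not details taken from the abstract.

```python
import numpy as np

def cca_embeddings(counts, dim, transform=np.sqrt):
    """Hypothetical helper: word embeddings from a word-by-context count matrix."""
    counts = np.asarray(counts, dtype=float)
    # Assumed count transformation (square root); the abstract only says
    # "certain count transformations".
    t = transform(counts)
    # Marginals of the transformed matrix (an assumption of this sketch).
    row = t.sum(axis=1)
    col = t.sum(axis=0)
    # Guard against division by zero for words or contexts with no counts.
    row[row == 0] = 1.0
    col[col == 0] = 1.0
    # CCA scaling: divide each entry by sqrt(row marginal * column marginal).
    omega = t / np.sqrt(np.outer(row, col))
    # The top left singular vectors give one embedding row per word.
    u, _, _ = np.linalg.svd(omega, full_matrices=False)
    emb = u[:, :dim]
    # Normalize rows to unit length so dot products act as cosine similarity.
    norms = np.linalg.norm(emb, axis=1, keepdims=True)
    norms[norms == 0] = 1.0
    return emb / norms

# Toy example: words 0 and 1 share the same contexts, word 2 does not.
toy_counts = [[4, 0, 1],
              [4, 0, 1],
              [0, 9, 0]]
toy_emb = cca_embeddings(toy_counts, dim=2)
```

In this toy matrix, words with identical context distributions receive identical embeddings, which is the kind of lexical similarity structure the abstract refers to.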
Karl Stratos, Michael Collins, Daniel Hsu
Added 13 Apr 2016
Updated 13 Apr 2016
Type Conference
Year 2015
Where ACL