Sciweavers

ACL
2015

Compact Lexicon Selection with Spectral Methods

In this paper, we introduce the task of selecting a compact lexicon from large, noisy gazetteers. This scenario arises often in practice, particularly in spoken language understanding (SLU). We propose a simple and effective solution based on matrix decomposition techniques: canonical correlation analysis (CCA) and rank-revealing QR (RRQR) factorization. CCA is first used to derive low-dimensional gazetteer embeddings from domain-specific search logs. RRQR is then used to find a subset of these embeddings whose span approximates the entire lexicon space. Experiments on slot tagging show that our method yields a small set of lexicon entities with an average relative error reduction of more than 50% over a randomly selected lexicon.
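The subset-selection step described in the abstract can be illustrated with a minimal sketch. It assumes the entity embeddings have already been computed (the CCA step is not shown) and uses greedy column-pivoted QR, written as modified Gram-Schmidt, as a common practical stand-in for RRQR; it is not necessarily the authors' exact algorithm. The function name `select_entities` and the toy vectors are hypothetical.

```python
import math


def select_entities(embeddings, k):
    """Greedy column-pivoted QR (modified Gram-Schmidt) over entity embeddings.

    embeddings: list of equal-length float vectors, one per lexicon entity
                (assumed to come from an earlier embedding step such as CCA).
    Returns the indices of at most k entities, in the order they were picked.
    """
    residual = [list(v) for v in embeddings]  # working copies; input is untouched
    selected = []
    for _ in range(min(k, len(residual))):
        # Pivot: the entity whose embedding is least explained by the picks so far.
        norms = [math.sqrt(sum(x * x for x in v)) for v in residual]
        pivot = max(range(len(residual)), key=lambda i: norms[i])
        if norms[pivot] < 1e-12:
            break  # every remaining embedding already lies in the selected span
        selected.append(pivot)
        q = [x / norms[pivot] for x in residual[pivot]]
        # Deflate: remove the component along q from every embedding.
        for v in residual:
            dot = sum(a * b for a, b in zip(v, q))
            for j in range(len(v)):
                v[j] -= dot * q[j]
    return selected


# Hypothetical 2-dimensional embeddings for four gazetteer entries.
toy = [[1.0, 0.0], [0.9, 0.1], [0.0, 2.0], [0.1, 1.9]]
picked = select_entities(toy, 2)
```

On this toy input the two picked entries point in nearly orthogonal directions, so together they span the space that the other two entries occupy. In practice a library routine such as a pivoted QR from a numerical package would likely replace the hand-written loop.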
Added 13 Apr 2016
Updated 13 Apr 2016
Type Journal
Year 2015
Where ACL
Authors Young-Bum Kim, Karl Stratos, Xiaohu Liu, Ruhi Sarikaya