We exploit sketch techniques, especially the Count-Min sketch, a memory, and time efficient framework which approximates the frequency of a word pair in the corpus without explic...
Class-instance label propagation algorithms have been successfully used to fuse information from multiple sources in order to enrich a set of unlabeled instances with class labels...
Zornitsa Kozareva, Konstantin Voevodski, Shang-Hua...
We show that categories induced by unsupervised word clustering can surpass the performance of gold part-of-speech tags in dependency grammar induction. Unlike classic clustering ...
Valentin I. Spitkovsky, Hiyan Alshawi, Angel X. Ch...
We present new training methods that aim to mitigate local optima and slow convergence in unsupervised training by using additional imperfect objectives. In its simplest form, lat...
Valentin I. Spitkovsky, Hiyan Alshawi, Daniel Jura...