In most IR clustering problems, we directly cluster the documents, working in the document space, using cosine similarity between documents as the similarity measure. In many real...
It has been widely observed that different NLP applications require different sense granularities in order to best exploit word sense distinctions, and that for many applications ...
Rion Snow, Sushant Prakash, Daniel Jurafsky, Andre...
Each clustering algorithm induces a similarity between given data points, according to the underlying clustering criteria. Given the large number of available clustering technique...
We propose a new method for handwritten word-spotting which does not require prior training or gathering examples for querying. More precisely, a model is trained "on the fly...
We present a technique for augmenting annotated training data with hierarchical word clusters that are automatically derived from a large unannotated corpus. Cluster membership is...