This paper is concerned with classifying high-dimensional data into one of two categories. In various settings, such as when dealing with fMRI and microarray data, the number of v...
We study a number of natural language decipherment problems using unsupervised learning. These include letter substitution ciphers, character code conversion, phonetic deciphermen...
Kevin Knight, Anish Nair, Nishit Rathod, Kenji Yam...
This paper introduces a novel statistical mixture model for probabilistic clustering of histogram data and, more generally, for the analysis of discrete co-occurrence data. Adoptin...
Unlike simple questions, complex questions cannot be answered by simply extracting named entities. These questions require inferencing and synthesizing information from multiple d...
One of the problems in part-of-speech tagging of real-world texts is that of words unknown to the lexicon. In (Mikheev, 1996), a technique for fully unsupervised statistical acquis...