In Latent Semantic Indexing (LSI), a collection of documents is often pre-processed to form a sparse term-document matrix, followed by a computation of a low-rank approximation to...
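As a rough illustration of the pre-processing described in the abstract above, the following sketch builds a small sparse term-document count matrix and computes its leading singular triplet, which gives a rank-1 approximation. It is not the paper's method. The function names `term_document_matrix` and `rank_one_approximation`, the whitespace tokenization, and the use of plain power iteration are all assumptions made for illustration. A practical LSI system would typically use a truncated SVD of higher rank from a numerical library.

```python
import math
from collections import Counter


def term_document_matrix(docs):
    """Build a sparse term-document count matrix.

    Returns (vocab, cols), where cols[j] maps term index -> count for
    document j. Tokenization is naive whitespace splitting (an assumption).
    """
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    cols = [
        {index[w]: c for w, c in Counter(d.lower().split()).items()}
        for d in docs
    ]
    return vocab, cols


def rank_one_approximation(cols, n_terms, iters=100):
    """Estimate the leading singular triplet (sigma, u, v) by power iteration.

    The rank-1 approximation of the matrix A is then sigma * u * v^T.
    Assumes at least one document is non-empty.
    """
    n_docs = len(cols)
    v = [1.0 / math.sqrt(n_docs)] * n_docs  # positive starting vector
    u = [0.0] * n_terms
    sigma = 0.0
    for _ in range(iters):
        # u <- A v, then normalize
        u = [0.0] * n_terms
        for j, col in enumerate(cols):
            for i, c in col.items():
                u[i] += c * v[j]
        norm_u = math.sqrt(sum(x * x for x in u))
        u = [x / norm_u for x in u]
        # v <- A^T u; its norm converges to the leading singular value
        v = [sum(c * u[i] for i, c in col.items()) for col in cols]
        sigma = math.sqrt(sum(x * x for x in v))
        v = [x / sigma for x in v]
    return sigma, u, v
```

Calling `term_document_matrix` on a list of strings and passing the result with `len(vocab)` to `rank_one_approximation` yields document weights `v` that group documents sharing vocabulary. Only the non-zero counts are stored, which is the point of keeping the matrix sparse.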
In this paper, we develop a novel online algorithm based on the Sequential Monte Carlo (SMC) samplers framework for posterior inference in Dirichlet Process Mixtures (DPM) (DelMor...
We describe an approach for multi-modal characterization of social media by combining text features (e.g., tags, a prominent example of short, unstructured text labels) with spat...
The paper considers the problem of semi-supervised multiview classification, where each view corresponds to a Reproducing Kernel Hilbert Space. An algorithm based on co-...
The application of statistical methods to natural language processing has been remarkably successful over the past two decades. However, to deal with recent problems arising in this ...