Previously, topic models such as PLSI (Probabilistic Latent Semantic Indexing) and LDA (Latent Dirichlet Allocation) were developed for modeling the contents of plain texts. Recent...
In recent years, Latent Semantic Indexing (LSI) has been recognized as an effective tool for Information Retrieval in text documents. The level of "granularity" in LSI (...
This paper proposes two methods of query expansion for retrieving paraphrase candidates indexed by Kanzi (Chinese) characters. The idea is to calculate similarity between Kanzi ch...
Two approaches for integrating images into the framework of a database management system are presented. The classification approach preprocesses all images and attaches a semantic ...
We develop a new algorithm for clustering search results. Unlike many other clustering systems that have been recently proposed as a post-processing step for Web search ...