We explore the utility of different types of topic models for retrieval purposes. Based on prior work, we describe several ways that topic models can be integrated into the retrie...
The aim of query-based sampling is to obtain a sufficient, representative sample of an underlying (text) collection. Current measures for assessing sample quality are too coarse gr...
For discrete co-occurrence data like documents and words, calculating optimal projections and clustering are two different but related tasks. The goal of projection is to find a ...
Shipeng Yu, Kai Yu, Volker Tresp, Hans-Peter Krieg...
Models of bags of words typically assume topic mixing so that the words in a single bag come from a limited number of topics. We show here that many sets of bag of words exhibit a...
The Web offers rich relational data with different semantics. In this paper, we address the problem of document recommendation in a digital library, where the documents in questio...
Ding Zhou, Shenghuo Zhu, Kai Yu, Xiaodan Song, Bel...