Name ambiguity problem has been a challenging issue for a long history. In this paper, we intend to make a thorough investigation of the whole problem. Specifically, we formalize ...
Large 0-1 datasets arise in various applications, such as market basket analysis and information retrieval. We concentrate on the study of topic models, aiming at results which in...
We propose a Bayesian extension to the ad-hoc Language Model. Many smoothed estimators used for the multinomial query model in ad-hoc Language Models (including Laplace and Bayes-...
Hugo Zaragoza, Djoerd Hiemstra, Michael E. Tipping
We introduce a multi-stage ensemble framework, ErrorDriven Generalist+Expert or Edge, for improved classification on large-scale text categorization problems. Edge first trains a ...
This paper follows a formal approach to information retrieval based on statistical language models. By introducing some simple reformulations of the basic language modeling approa...