The structure of the web is increasingly being used to improve organization, search, and analysis of information on the web. For example, Google uses the text in citing documents ...
Eric J. Glover, Kostas Tsioutsiouliklis, Steve Law...
Considering the statistical text classification problem we approximate class-conditional probability distributions by structurally modified Poisson mixtures. By introducing the st...
This paper reports the participation of the University of Lisbon at the 2007 GeoCLEF task. We adopted a novel approach for GIR, focused on handling geographic features and feature ...
Nuno Cardoso, David Cruz, Marcirio Silveira Chaves...
Abstract. In this paper, we describe a new unsupervised sentence boundary detection system and present a comparative study evaluating its performance against different systems foun...
Jan Strunk, Carlos Nascimento Silla Jr., Celso A. ...
A distributed system is described that reliably mines parallel text from large corpora. The approach can be regarded as cross-language near-duplicate detection, enabled by an init...
Jakob Uszkoreit, Jay Ponte, Ashok C. Popat, Moshe ...