In this paper we propose a new information-theoretic divisive algorithm for word clustering applied to text classification. In previous work, such "distributional clustering"...
Inderjit S. Dhillon, Subramanyam Mallela, Rahul Ku...
Diminishing awareness is a consequence of the information explosion: disciplines are becoming increasingly specialized; individuals and groups are becoming ever more insular. This...
Peter Bruza, Richard Cole, Dawei Song, Zeeniya Bar...
As the number of available Web pages grows, users experience increasing difficulty finding documents relevant to their interests. One of the underlying reasons for this is that mo...
Government regulations are semi-structured text documents that are often voluminous, heavily cross-referenced between provisions and even ambiguous. Multiple sources of regulation...
Recent content-based video retrieval systems combine output of concept detectors (also known as high-level features) with text obtained through automatic speech recognition. This ...
Robin Aly, Djoerd Hiemstra, Arjen P. de Vries, Fra...