In this manuscript we present the summarization and categorization subsystems of a complete mechanism that begins with web-page fetching and concludes with representation of the c...
The subject of collective attention is central to an information age where millions of people are inundated with daily messages. It is thus of interest to understand how attention...
We investigate a representative case of a sudden change in the information needs of Web users. By analyzing search engine query logs, we show that the majority of queries submitted by users...
This paper proposes an unsupervised learning model for classifying named entities. This model uses a training set, built automatically by means of a small-scale named entity dicti...
The rate of occurrence of words is not uniform but varies from document to document. Despite this observation, parameters for conventional n-gram language models are usually deriv...