This paper presents a large-scale system for the recognition and semantic disambiguation of named entities based on information extracted from a large encyclopedic collection and ...
Naïve Bayes (NB) classifier has long been considered a core methodology in text classification mainly due to its simplicity and computational efficiency. There is an increasing n...
Statistical language models can learn relationships between topics discussed in a document collection and persons, organizations and places mentioned in each document. We present a...
David Newman, Chaitanya Chemudugunta, Padhraic Smy...
Many applications in text processing require significant human effort for either labeling large document collections (when learning statistical models) or extrapolating rules from...
We study the problem of maintaining large replicated collections of files or documents in a distributed environment with limited bandwidth. This problem arises in a number of impo...