Inverted index compression, block addressing and sequential search on compressed text are three techniques that have been separately developed for efficient, low-overhead text retrie...
Gonzalo Navarro, Edleno Silva de Moura, Marden S. ...
Domain-specific internet portals are growing in popularity because they gather content from the Web and organize it for easy access, retrieval and search. For example, www.campsear...
Andrew McCallum, Kamal Nigam, Jason Rennie, Kristi...
Previous research in conceptual indexing methods of images has furnished us with refined theoretical frameworks characterising various aspects of images that could and should be ...
This paper presents Ellogon, a multi-lingual, cross-platform, general-purpose text engineering environment. Ellogon was designed in order to aid both researchers in natural langua...
This paper describes how automated deduction methods for natural language processing can be applied more efficiently by encoding context in a more elaborate way. Our work is based on f...
Background: Researchers who use MEDLINE for text mining, information extraction, or natural language processing may benefit from having a copy of MEDLINE that they can manage loca...
Diane E. Oliver, Gaurav Bhalotia, Ariel S. Schwart...
Due to the great variation of biological names in biomedical text, appropriate tokenization is an important preprocessing step for biomedical information retrieval. Despite its im...
Word form normalization through lemmatization or stemming is a standard procedure in information retrieval because morphological variation needs to be accounted for and several la...
In information retrieval, the cluster hypothesis states: closely related documents tend to be relevant to the same request. We exploit this hypothesis directly by adjusting queryb...