Sciweavers

IR
2000
13 years 7 months ago
Adding Compression to Block Addressing Inverted Indexes
Inverted index compression, block addressing and sequential search on compressed text are three techniques that have been separately developed for efficient, low-overhead text retrie...
Gonzalo Navarro, Edleno Silva de Moura, Marden S. ...
IR
2000
13 years 7 months ago
Automating the Construction of Internet Portals with Machine Learning
Domain-specific internet portals are growing in popularity because they gather content from the Web and organize it for easy access, retrieval and search. For example, www.campsear...
Andrew McCallum, Kamal Nigam, Jason Rennie, Kristi...
IR
2000
13 years 7 months ago
End-User Searching Challenges Indexing Practices in the Digital Newspaper Photo Archive
Previous research in conceptual indexing methods of images has furnished us with refined theoretical frameworks characterising various aspects of images that could and should be ...
Marjo Markkula, Eero Sormunen
CORR
2002
Springer
13 years 7 months ago
Ellogon: A New Text Engineering Platform
This paper presents Ellogon, a multi-lingual, cross-platform, general-purpose text engineering environment. Ellogon was designed in order to aid both researchers in natural langua...
Georgios Petasis, Vangelis Karkaletsis, Georgios P...
CORR
2000
Springer
13 years 7 months ago
Computing Presuppositions by Contextual Reasoning
This paper describes how automated deduction methods for natural language processing can be applied more efficiently by encoding context in a more elaborate way. Our work is based on f...
Christof Monz
BMCBI
2004
13 years 7 months ago
Tools for loading MEDLINE into a local relational database
Background: Researchers who use MEDLINE for text mining, information extraction, or natural language processing may benefit from having a copy of MEDLINE that they can manage loca...
Diane E. Oliver, Gaurav Bhalotia, Ariel S. Schwart...
IR
2007
13 years 7 months ago
An empirical study of tokenization strategies for biomedical information retrieval
Due to the great variation of biological names in biomedical text, appropriate tokenization is an important preprocessing step for biomedical information retrieval. Despite its im...
Jing Jiang, ChengXiang Zhai
IR
2007
13 years 7 months ago
Restricted inflectional form generation in management of morphological keyword variation
Word form normalization through lemmatization or stemming is a standard procedure in information retrieval because morphological variation needs to be accounted for and several la...
Kimmo Kettunen, Eija Airio, Kalervo Järvelin
IR
2007
13 years 7 months ago
Regularizing query-based retrieval scores
In information retrieval, the cluster hypothesis states: closely related documents tend to be relevant to the same request. We exploit this hypothesis directly by adjusting queryb...
Fernando Diaz