Sciweavers

IR
2007
13 years 10 months ago
An empirical study of tokenization strategies for biomedical information retrieval
Due to the great variation of biological names in biomedical text, appropriate tokenization is an important preprocessing step for biomedical information retrieval. Despite its im...
Jing Jiang, ChengXiang Zhai
IR
2007
13 years 10 months ago
Restricted inflectional form generation in management of morphological keyword variation
Word form normalization through lemmatization or stemming is a standard procedure in information retrieval because morphological variation needs to be accounted for and several la...
Kimmo Kettunen, Eija Airio, Kalervo Järvelin
IR
2007
13 years 10 months ago
Regularizing query-based retrieval scores
In information retrieval, the cluster hypothesis states: closely related documents tend to be relevant to the same request. We exploit this hypothesis directly by adjusting queryb...
Fernando Diaz
IR
2007
13 years 10 months ago
Searching strategies for the Bulgarian language
This paper reports on the underlying IR problems encountered when indexing and searching with the Bulgarian language. For this language we propose a general light stemmer and demon...
Jacques Savoy
IR
2007
13 years 10 months ago
Learning-based summarisation of XML documents
Documents formatted in eXtensible Markup Language (XML) are available in collections of various document types. In this paper, we present an approach for the summarisation of XML d...
Massih-Reza Amini, Anastasios Tombros, Nicolas Usu...
IR
2007
13 years 10 months ago
Modeling context through domain ontologies
Traditional information retrieval systems aim at satisfying most users for most of their searches, leaving aside the context in which the search takes place. We propose to model tw...
Nathalie Hernandez, Josiane Mothe, Claude Chrismen...
IR
2007
13 years 10 months ago
Lightweight natural language text compression
Variants of Huffman codes where words are taken as the source symbols are currently the most attractive choices to compress natural language text databases. In particular, Tagged...
Nieves R. Brisaboa, Antonio Fariña, Gonzalo...