Sciweavers

ECIR
2006
Springer

Using Concept-Based Indexing to Improve Language Modeling Approach to Genomic IR

Genomic IR, characterized by highly specific information needs, severe synonymy and polysemy problems, long term names, and a rapidly growing literature, poses a challenge to the IR community. In this paper, we focus on addressing the synonymy and polysemy issue within the language modeling framework. Unlike translation models and traditional query expansion techniques, we incorporate concept-based indexing into a basic language model for genomic IR. In particular, we adopt UMLS concepts as indexing and searching terms. A UMLS concept stands for a unique meaning in the biomedical domain; a set of synonymous terms shares the same concept ID. Therefore, the new approach makes document ranking effective while maintaining the simplicity of language models. A comparative experiment on the TREC 2004 Genomics Track data shows that significant improvements are obtained by incorporating concept-based indexing into a basic language model. The MAP (mean average precision) is s...
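The idea in the abstract, indexing and searching on concept IDs so that synonyms collapse to one term before a language model ranks documents, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the authors' implementation: the term-to-concept table and its IDs are invented (they are not real UMLS identifiers), tokenization is plain whitespace splitting, and Dirichlet smoothing with the parameter `mu` is one common choice that the abstract does not specify.

```python
import math
from collections import Counter

# Hypothetical term-to-concept table. The IDs are made up for illustration
# and are NOT real UMLS concept identifiers.
TERM_TO_CONCEPT = {
    "tumor": "C_NEOPLASM",
    "tumour": "C_NEOPLASM",
    "neoplasm": "C_NEOPLASM",
    "p53": "C_TP53",
    "tp53": "C_TP53",
    "mouse": "C_MOUSE",
    "mice": "C_MOUSE",
}


def to_concepts(text):
    """Map each known token to its concept ID; unknown tokens are dropped."""
    tokens = text.lower().split()
    return [TERM_TO_CONCEPT[t] for t in tokens if t in TERM_TO_CONCEPT]


def score(query, doc, collection_counts, collection_len, mu=10.0):
    """Log query likelihood under a Dirichlet-smoothed unigram concept model."""
    doc_counts = Counter(doc)
    total = 0.0
    for concept in query:
        # Background probability of the concept in the whole collection.
        p_coll = collection_counts[concept] / collection_len if collection_len else 0.0
        if p_coll == 0.0:
            continue  # concept never occurs in the collection; skip it
        p = (doc_counts[concept] + mu * p_coll) / (len(doc) + mu)
        total += math.log(p)
    return total


def rank(query_text, docs, mu=10.0):
    """Return document IDs ordered from best to worst match."""
    concept_docs = {doc_id: to_concepts(text) for doc_id, text in docs.items()}
    collection_counts = Counter(c for d in concept_docs.values() for c in d)
    collection_len = sum(collection_counts.values())
    query = to_concepts(query_text)
    scored = [
        (score(query, d, collection_counts, collection_len, mu), doc_id)
        for doc_id, d in concept_docs.items()
    ]
    return [doc_id for _, doc_id in sorted(scored, reverse=True)]
```

In this sketch a query such as "p53 tumour" can match a document that only contains "tp53" and "neoplasm", because both sides are reduced to the same concept IDs before scoring; no query expansion or translation step is involved.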
Xiaohua Zhou, Xiaodan Zhang, Xiaohua Hu
Added 30 Oct 2010
Updated 30 Oct 2010
Type Conference
Year 2006
Where ECIR