
TREC
2003

Identifying Gene Function Descriptions by Probability-based Sentence Selection

This paper proposes an approach to the secondary task in the TREC Genomics Track. We treat the task as identifying the sentences that describe gene functions (i.e., GeneRIFs) and propose a method that considers two factors: topicality and relevance. Topicality is measured from location information and word frequencies in the article. Relevance is the suitability of a sentence as a GeneRIF, based on the vocabulary used in the article. We formalize a probabilistic model combining these features. Our method is evaluated on the test set of 139 MEDLINE abstracts, and the results demonstrate that (a) function words in the input can help identify gene function descriptions, (b) there is a vocabulary peculiar to GeneRIFs, and (c) location information shows the highest predictive power for this task despite its simplicity. We also compare our method with several alternative methods.
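As a rough illustration of the idea in the abstract, the sketch below scores each sentence by the product of a topicality term (location plus in-article word frequency) and a relevance term (word probabilities under a GeneRIF vocabulary model), then picks the highest-scoring sentence. This is not the authors' model: the location weights, the add-one smoothing, the product combination, and all function names are assumptions made for illustration only.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer that strips basic punctuation."""
    tokens = (w.strip(".,;:()").lower() for w in text.split())
    return [t for t in tokens if t]


def topicality(index, n_sentences, tokens, doc_counts, doc_total):
    """Hypothetical topicality: a location prior times mean in-article word frequency."""
    # Assumed location prior: the first and last sentences receive more weight.
    if index == 0:
        loc = 0.5
    elif index == n_sentences - 1:
        loc = 0.3
    else:
        loc = 0.2 / max(n_sentences - 2, 1)
    if not tokens:
        return 0.0
    mean_freq = sum(doc_counts[w] for w in tokens) / (doc_total * len(tokens))
    return loc * mean_freq


def relevance(tokens, rif_counts, rif_total, vocab_size):
    """Hypothetical relevance: geometric mean of add-one-smoothed P(word | GeneRIF)."""
    if not tokens:
        return 0.0
    log_p = sum(
        math.log((rif_counts[w] + 1) / (rif_total + vocab_size)) for w in tokens
    )
    return math.exp(log_p / len(tokens))


def select_sentence(sentences, rif_counts):
    """Return (index, sentence) with the highest topicality * relevance score."""
    tokenized = [tokenize(s) for s in sentences]
    doc_counts = Counter(w for toks in tokenized for w in toks)
    doc_total = sum(doc_counts.values())
    rif_total = sum(rif_counts.values())
    vocab_size = len(set(doc_counts) | set(rif_counts))
    scores = [
        topicality(i, len(sentences), toks, doc_counts, doc_total)
        * relevance(toks, rif_counts, rif_total, vocab_size)
        for i, toks in enumerate(tokenized)
    ]
    best = max(range(len(sentences)), key=scores.__getitem__)
    return best, sentences[best]
```

Here `rif_counts` stands in for word counts gathered from known GeneRIFs; in practice the location prior and vocabulary model would be estimated from training data rather than fixed by hand.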
Kazuhiro Seki, Nihar Sheth, Javed Mostafa
Type Conference
Year 2003
Where TREC