Sciweavers

BMCBI
2008

Gene Ontology density estimation and discourse analysis for automatic GeneRiF extraction

14 years 18 days ago
Gene Ontology density estimation and discourse analysis for automatic GeneRiF extraction
Background: This paper describes and evaluates a sentence selection engine that extracts a GeneRiF (Gene Reference into Functions) as defined in ENTREZ-Gene based on a MEDLINE record. Inputs for this task include both a gene and a pointer to a MEDLINE reference. In the suggested approach we merge two independent sentence extraction strategies. The first proposed strategy (LASt) uses argumentative features, inspired by discourse-analysis models. The second extraction scheme (GOEx) uses an automatic text categorizer to estimate the density of Gene Ontology categories in every sentence; thus providing a full ranking of all possible candidate GeneRiFs. A combination of the two approaches is proposed, which also aims at reducing the size of the selected segment by filtering out non-content bearing rhetorical phrases. Results: Based on the TREC-2003 Genomics collection for GeneRiF identification, the LASt extraction strategy is already competitive (52.78%). When used in a combined approach,...
Julien Gobeill, Imad Tbahriti, Frédé
Added 09 Dec 2010
Updated 09 Dec 2010
Type Journal
Year 2008
Where BMCBI
Authors Julien Gobeill, Imad Tbahriti, Frédéric Ehrler, Anaïs Mottaz, Anne-Lise Veuthey, Patrick Ruch
Comments (0)