Identifying overrepresented concepts in gene lists from literature: a statistical approach based on Poisson mixture model

Background: Large-scale genomic studies often produce large gene lists, for example, sets of genes that share the same expression pattern. These gene lists are generally interpreted by extracting the concepts overrepresented in them. This analysis often depends on manual annotation of genes based on controlled vocabularies, in particular Gene Ontology (GO). However, annotating genes is a labor-intensive process, and the vocabularies are generally incomplete, leaving some important biological domains inadequately covered.

Results: We propose a statistical method that uses the primary literature, i.e. free text, as the source for overrepresentation analysis. The method is based on a statistical mixture-model framework and addresses methodological flaws in several existing programs. We implemented this method within a literature mining system, BeeSpace, taking advantage of its analysis environment and added features that facilitate the interactive...
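The abstract does not spell out the model, so the following is only a generic illustration of what fitting a Poisson mixture looks like, not the authors' method. It is a minimal sketch that assumes a two-component mixture fitted by expectation-maximization (EM), where each observation is a count (for instance, how often a term occurs in a document). The function names, starting values, and toy counts are all hypothetical.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability of count k at rate lam, computed in log space."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def fit_poisson_mixture(counts, lam_low, lam_high, weight_high=0.5, n_iter=200):
    """Fit a two-component Poisson mixture by EM (illustrative sketch only).

    Returns (lam_low, lam_high, weight_high, responsibilities), where each
    responsibility is the posterior probability that a count came from the
    high-rate component.
    """
    n = len(counts)
    resp = [0.5] * n
    for _ in range(n_iter):
        # E-step: posterior probability of the high-rate component per count.
        resp = []
        for k in counts:
            p_high = weight_high * poisson_pmf(k, lam_high)
            p_low = (1.0 - weight_high) * poisson_pmf(k, lam_low)
            total = p_high + p_low
            resp.append(p_high / total if total > 0.0 else 0.5)

        # M-step: re-estimate the mixing weight and both rates.
        mass_high = sum(resp)
        mass_low = n - mass_high
        if mass_high <= 0.0 or mass_low <= 0.0:
            break  # one component has collapsed; keep the last estimates
        weight_high = mass_high / n
        # Rates are clamped away from zero so that log(lam) stays defined.
        lam_high = max(sum(r * k for r, k in zip(resp, counts)) / mass_high, 1e-9)
        lam_low = max(sum((1.0 - r) * k for r, k in zip(resp, counts)) / mass_low, 1e-9)
    return lam_low, lam_high, weight_high, resp


# Toy counts, invented for illustration: mostly low values plus a few high ones.
toy_counts = [0, 1, 0, 2, 1, 0, 1, 9, 11, 10, 8, 12]
low, high, weight, posteriors = fit_poisson_mixture(toy_counts, lam_low=1.0, lam_high=5.0)
print(f"low rate={low:.2f}, high rate={high:.2f}, weight of high component={weight:.2f}")
```

In a sketch like this, a count with a posterior close to 1 is better explained by the high-rate component than by the low-rate one. That is the general intuition behind using a mixture to separate unusually frequent items from background frequency.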

Xin He, Moushumi Sen Sarma, Xu Ling, Brant W. Chee

Analysis | BMCBI 2010 | Gene | Gene Lists |

Added	08 Dec 2010
Updated	08 Dec 2010
Type	Journal
Year	2010
Where	BMCBI
Authors	Xin He, Moushumi Sen Sarma, Xu Ling, Brant W. Chee, Chengxiang Zhai, Bruce R. Schatz
