Speeding disease gene discovery by sequence based candidate prioritization


Download www.biomedcentral.com

Background: Regions of interest identified through genetic linkage studies regularly exceed 30 centimorgans in size and can contain hundreds of genes. Traditionally this number is reduced by matching functional annotation to knowledge of the disease or phenotype in question. However, here we show that disease genes share patterns of sequence-based features that can provide a good basis for automatic prioritization of candidates by machine learning.

Results: We examined a variety of sequence-based features and found that for many of them there are significant differences between the sets of genes known to be involved in human hereditary disease and those not known to be involved in disease. We have created an automatic classifier called PROSPECTR based on those features using the alternating decision tree algorithm which ranks genes in the order of likelihood of involvement in disease. On average, PROSPECTR enriches lists for disease genes two-fold 77% of the time, five-fold 37% of the...
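To make the "two-fold" and "five-fold" enrichment figures concrete, here is a minimal sketch of one common way to measure how much a ranked candidate list enriches for known disease genes. The abstract does not give the paper's exact definition, so the formula, the function name fold_enrichment, the top_fraction parameter, and the gene labels are all illustrative assumptions and not PROSPECTR's own code.

```python
def fold_enrichment(ranked_genes, disease_genes, top_fraction):
    """Fold enrichment of known disease genes in the top of a ranked list.

    Assumed definition (not taken from the paper): the proportion of
    disease genes in the top slice divided by their proportion in the
    whole list.
    """
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")

    total = len(ranked_genes)
    if total == 0:
        raise ValueError("ranked_genes must not be empty")

    known = set(disease_genes)
    hits_overall = sum(1 for gene in ranked_genes if gene in known)
    if hits_overall == 0:
        raise ValueError("no known disease genes in the ranked list")

    # Size of the top slice; always inspect at least one gene.
    top_n = max(1, int(total * top_fraction))
    hits_top = sum(1 for gene in ranked_genes[:top_n] if gene in known)

    return (hits_top / top_n) / (hits_overall / total)


# Hypothetical region of ten candidate genes, ranked best-first by a classifier.
candidates = ["G1", "G2", "G3", "G4", "G5", "G6", "G7", "G8", "G9", "G10"]
enrichment = fold_enrichment(candidates, {"G2"}, top_fraction=0.5)
print(enrichment)
```

Under this assumed definition, a single disease gene that lands in the top half of a ten-gene list gives a two-fold enrichment, since its density in the inspected slice is twice its density in the full list.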

Euan A. Adie, Richard R. Adams, Kathryn L. Evans, David J. Porteous, Ben S. Pickard


BMCBI 2005 | Disease | Disease Genes | Sequence-based Features |


Post Info

Added	15 Dec 2010
Updated	15 Dec 2010
Type	Journal
Year	2005
Where	BMCBI
Authors	Euan A. Adie, Richard R. Adams, Kathryn L. Evans, David J. Porteous, Ben S. Pickard
