Sciweavers

TCBB
2008

2SNP: Scalable Phasing Method for Trios and Unrelated Individuals

Emerging microarray technologies allow affordable typing of very long genome sequences. A key challenge in analyzing such huge amounts of data is scalable and accurate computational inference of haplotypes (i.e., splitting each genotype into a pair of corresponding haplotypes). In this paper, we first phase genotypes consisting of only two SNPs using genotype frequencies adjusted to the random mating model, and then extend the phasing of two-SNP genotypes to the phasing of complete genotypes using maximum spanning trees. The runtime of the proposed 2SNP algorithm is O(nm(n + log m)), where n and m are the numbers of genotypes and SNPs, respectively, and it can handle genotypes spanning entire chromosomes in a matter of hours. On datasets across 23 chromosomal regions from HapMap [11], 2SNP is several orders of magnitude faster than GERBIL and PHASE while matching them in quality, as measured by the number of correctly phased genotypes and by single-site and switching errors. For example, the 2SNP software...
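The two-stage idea in the abstract (phase SNP pairs first, then stitch the pairs together along a maximum spanning tree) can be illustrated with a small Python sketch. This is not the authors' implementation: the genotype encoding (0 and 1 for the two homozygotes, 2 for a heterozygote), the function names, the pseudocount-based pair score, and the choice to build the tree over the heterozygous sites of each genotype are all assumptions made for illustration. In particular, the score below simply counts haplotypes in genotypes that are unambiguous at the two sites; it does not reproduce the random-mating adjustment used in the paper, and it does not achieve the stated runtime.

```python
from collections import Counter
from itertools import combinations
import math

# Assumed encoding: 0 = homozygous 0/0, 1 = homozygous 1/1, 2 = heterozygous.

def pair_haplotype_counts(genotypes, i, j):
    """Count two-SNP haplotypes at sites i, j from unambiguous genotypes."""
    counts = Counter()
    for g in genotypes:
        a, b = g[i], g[j]
        if a == 2 and b == 2:
            continue  # double heterozygote: phase unknown, contributes nothing
        alleles_i = (0, 1) if a == 2 else (a, a)
        alleles_j = (0, 1) if b == 2 else (b, b)
        counts[(alleles_i[0], alleles_j[0])] += 1
        counts[(alleles_i[1], alleles_j[1])] += 1
    return counts

def pair_score(genotypes, i, j):
    """Signed score: > 0 favours cis (00/11), < 0 favours trans (01/10)."""
    c = pair_haplotype_counts(genotypes, i, j)
    cis = (c[(0, 0)] + 1) * (c[(1, 1)] + 1)    # +1 is a pseudocount
    trans = (c[(0, 1)] + 1) * (c[(1, 0)] + 1)
    return math.log(cis / trans)

def maximum_spanning_tree(m, weights):
    """Kruskal's algorithm on edges {(i, j): weight}, heaviest edge first."""
    parent = list(range(m))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = []
    for (i, j), _ in sorted(weights.items(), key=lambda kv: -kv[1]):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            tree.append((i, j))
    return tree

def phase_genotype(genotypes, g):
    """Split genotype g into two haplotypes using pairwise phasing on a tree."""
    het = [k for k, x in enumerate(g) if x == 2]
    h1 = [x if x != 2 else None for x in g]
    if het:
        scores = {(i, j): pair_score(genotypes, i, j)
                  for i, j in combinations(het, 2)}
        # Edge weight = confidence of the pairwise call, regardless of its sign.
        weights = {e: abs(s) for e, s in scores.items()}
        adj = {k: [] for k in het}
        for i, j in maximum_spanning_tree(len(g), weights):
            adj[i].append(j)
            adj[j].append(i)
        h1[het[0]] = 0  # fix the first heterozygous site arbitrarily
        stack = [het[0]]
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if h1[v] is None:
                    s = scores[(min(u, v), max(u, v))]
                    h1[v] = h1[u] if s >= 0 else 1 - h1[u]
                    stack.append(v)
    # The second haplotype is the complement of the first at heterozygous sites.
    h2 = [x if y != 2 else 1 - x for x, y in zip(h1, g)]
    return h1, h2

# Toy data: homozygous individuals suggest haplotypes 001 and 110.
toy = [[0, 0, 1], [1, 1, 0], [0, 0, 1], [1, 1, 0], [2, 2, 2]]
h1, h2 = phase_genotype(toy, [2, 2, 2])
```

Weighting each edge by the absolute value of the pair score means the tree keeps the most confident pairwise calls, so less reliable pairs only influence the result when no stronger path connects the two sites.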
Dumitru Brinza, Alexander Zelikovsky
Added 15 Dec 2010
Updated 15 Dec 2010
Type Journal
Year 2008
Where TCBB
Authors Dumitru Brinza, Alexander Zelikovsky