Sciweavers

BMCBI
2008

SNPFile - A software library and file format for large scale association mapping and population genetics studies

Background: High-throughput genotyping technology has enabled cost-effective typing of thousands of individuals in hundreds of thousands of markers for use in genome-wide studies. This vast improvement in data acquisition technology makes it an informatics challenge to efficiently store and manipulate the data. While spreadsheets and flat text files were adequate solutions earlier, the increased data size mandates more efficient solutions. Results: We describe a new binary file format for SNP data, together with a software library for file manipulation. The file format stores genotype data together with any kind of additional data, using a flexible serialisation mechanism. The format is designed to be IO efficient for the access patterns of most multi-locus analysis methods. Conclusion: The new file format has been very useful for our own studies where it has significantly reduced the informatics burden in keeping track of various secondary data, and where the memory and IO efficiency ha...
Jesper Nielsen, Thomas Mailund
Added: 08 Dec 2010
Updated: 08 Dec 2010
Type: Journal
Year: 2008
Where: BMCBI
Authors: Jesper Nielsen, Thomas Mailund