Sciweavers

BMCBI
2016

MetaCRAM: an integrated pipeline for metagenomic taxonomy identification and compression

Background: Metagenomics is a genomics research discipline devoted to the study of microbial communities in environmental samples and in human and animal organs and tissues. Sequenced metagenomic samples usually comprise reads from a large number of different bacterial communities and hence tend to result in large files, typically between 1 and 10 GB in size. This leads to challenges in analyzing, transferring and storing metagenomic data. To overcome these data processing issues, we introduce MetaCRAM, the first de novo, parallelized software suite specialized for processing and lossless compression of metagenomic reads in FASTA and FASTQ format.
Results: MetaCRAM integrates algorithms for taxonomy identification and assembly, and introduces parallel execution methods; furthermore, it enables genome reference selection and CRAM-based compression. MetaCRAM also uses novel reference-based compression methods designed through extensive studies of integer compression techniques and thr...
Added 30 Mar 2016
Updated 30 Mar 2016
Type Journal
Year 2016
Where BMCBI
Authors MinJi Kim, Xiejia Zhang, Jonathan G. Ligo, Farzad Farnoud, Venugopal V. Veeravalli, Olgica Milenkovic