With current technology, large contiguous stretches of DNA (such as whole chromosomes) are usually assembled from short fragments obtained by shotgun sequencing, or from fragments and mate-pairs if a “double-barreled” shotgun strategy is employed. The positions of the fragments (and mate-pairs, if available) in an assembled sequence can be used to evaluate the quality of the assembly and also to compare two different assemblies of the same chromosome, even if they come from two different sequencing projects. This paper describes some simple and fast methods of this type that were developed to evaluate and compare different assemblies of the human genome. Additional applications include “feature-tracking” from one version of an assembly to the next, comparisons of different chromosomes within the same genome, and comparisons between similar chromosomes from different species.
Daniel H. Huson, Aaron L. Halpern, Zhongwu Lai, Eu