In a recent paper (Preparata et al., 1999) we introduced a novel probing scheme for DNA sequencing by hybridization (SBH). The new gapped-probe scheme combines natural and universal bases in a well-de ned periodic pattern. It has been shown (Preparata et al., 1999) that the performance of the gapped-probe scheme (in terms of the length of a sequence that can be uniquely reconstructed using a given size library of probes) is signi cantly better than the standard scheme based on oligomer probes. In this paper we present and analyze a new, more powerful, sequencing algorithm for the gapped-probe scheme. We prove that the new algorithm exploits the full potential of the SBH technology with high-con dence performance that comes within a small constant factor (about 2) of the information-theory bound. Moreover, this performance is achieved while maintaining running time linear in the target sequence length. Key words: DNA sequencing, sequencing by hybridization, gapped probes, probabilistic...
Franco P. Preparata, Eli Upfal