A Randomized Numerical Aligner (rNA)



With the advent of new sequencing technologies able to produce an enormous quantity of short genomic sequences, new tools have emerged that search for these sequences inside a reference genome. Because of chemical reading errors or variability between organisms, one is interested in finding not only exact occurrences, but also occurrences with up to k mismatches. The contribution of this paper is twofold. On one hand, we present a generalization of the classical Rabin-Karp string matching algorithm to solve the k-mismatch problem, with average complexity O(n + m). On the other hand, we show how to employ this idea in conjunction with an index over the text, allowing a pattern to be searched for, with up to k mismatches, in time proportional to its length. This novel tool, rNA (randomized Numerical Aligner), outperforms available tools such as SOAP2, BWA, and BOWTIE, processing up to 10 times more patterns per second on texts of (practically) significant lengths.
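To make the k-mismatch idea in the abstract concrete, here is a minimal sketch of one way a Rabin-Karp-style fingerprint can tolerate mismatches. It is not taken from the paper and may differ from rNA's actual construction. The base-4 encoding, the modulus Q, and the names fingerprint, mismatch_offsets, and k_mismatch_search are all assumptions made for illustration. The sketch relies on the fingerprint being linear: changing up to k characters shifts it by one of a precomputable set of offsets, so a window is a candidate when its fingerprint differs from the pattern's by such an offset. Candidates are then verified by counting mismatches directly.

```python
from itertools import combinations, product

# Assumed encoding and modulus; both are illustrative choices, not from the paper.
ENC = {"A": 0, "C": 1, "G": 2, "T": 3}
Q = 2_147_483_647  # the Mersenne prime 2^31 - 1


def fingerprint(s):
    """Base-4 value of s, reduced modulo Q."""
    h = 0
    for ch in s:
        h = (h * 4 + ENC[ch]) % Q
    return h


def mismatch_offsets(m, k):
    """All values (H(y) - H(x)) mod Q for length-m strings x, y that differ
    in at most k positions. One mismatch at position p changes the
    fingerprint by d * 4^(m-1-p), where d is a nonzero value in -3..3."""
    weights = [pow(4, m - 1 - i, Q) for i in range(m)]
    deltas = [-3, -2, -1, 1, 2, 3]
    offsets = {0}
    for j in range(1, k + 1):
        for positions in combinations(range(m), j):
            for ds in product(deltas, repeat=j):
                total = sum(d * weights[p] for p, d in zip(positions, ds))
                offsets.add(total % Q)
    return offsets


def hamming(a, b):
    """Number of positions at which equal-length strings a and b differ."""
    return sum(x != y for x, y in zip(a, b))


def k_mismatch_search(text, pattern, k):
    """Start positions where pattern occurs in text with at most k mismatches."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    offsets = mismatch_offsets(m, k)
    hp = fingerprint(pattern)
    hw = fingerprint(text[:m])
    top = pow(4, m - 1, Q)  # weight of the character leaving the window
    hits = []
    for i in range(n - m + 1):
        # Fingerprint filter first, then explicit verification of the candidate.
        if (hw - hp) % Q in offsets and hamming(text[i:i + m], pattern) <= k:
            hits.append(i)
        if i + m < n:
            # Roll the window one character to the right.
            hw = ((hw - ENC[text[i]] * top) * 4 + ENC[text[i + m]]) % Q
    return hits


print(k_mismatch_search("ACGTACGTTAGC", "ACGA", 1))  # prints [0, 4]
```

The offset table in this sketch grows quickly with k and the pattern length, so it is only meant to illustrate the filtering principle on small inputs. It makes no attempt to reach the O(n + m) average complexity or the index-based search stated in the abstract.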

Alberto Policriti, Alexandru I. Tomescu, Francesco Vezzi


Automata Theory | Chemical Reading Errors | LATA 2010 | References Sequence Genome | Short Genomic Sequences |



Post Info

Added	09 Jul 2010
Updated	09 Jul 2010
Type	Conference
Year	2010
Where	LATA
Authors	Alberto Policriti, Alexandru I. Tomescu, Francesco Vezzi
