VARUN: Discovering Extensible Motifs under Saturation Constraints


Abstract: The discovery of motifs in biosequences is frequently torn between the rigidity of the model on the one hand and the abundance of candidates on the other. In particular, the number of candidate motifs that include wildcards or "don't cares" grows exponentially with the number of wildcards, and this only gets worse if a don't care is allowed to stretch up to some prescribed maximum length. In this paper, a notion of extensible motif in a sequence is introduced and studied, which tightly combines the structure of the motif pattern, as described by its syntactic specification, with the statistical measure of its occurrence count. It is shown that a combination of appropriate saturation conditions and the monotonicity of probabilistic scores over regions of constant frequency affords significant parsimony in the generation and testing of candidate overrepresented motifs. A suite of software programs called Varun is described, implementing the discovery of extensible motifs of the type considered. The ...
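To make the idea of a don't care that may stretch up to a prescribed maximum length concrete, the following is a minimal sketch that locates occurrences of one such pattern in a sequence. It is not the Varun software or the paper's algorithm: the motif encoding (a list of solid characters and `(min_len, max_len)` gap tuples) and the function names `motif_to_regex` and `occurrence_starts` are assumptions made purely for illustration, and the matching is delegated to Python's standard `re` module.

```python
import re


def motif_to_regex(elements):
    """Compile a hypothetical motif encoding into a regular expression.

    `elements` is a list whose items are either a solid character (str)
    or a gap given as a (min_len, max_len) tuple of don't-care positions.
    """
    parts = []
    for element in elements:
        if isinstance(element, tuple):
            min_len, max_len = element
            # A variable-length run of don't cares.
            parts.append(".{%d,%d}" % (min_len, max_len))
        else:
            parts.append(re.escape(element))
    # The lookahead lets overlapping occurrences be reported.
    return re.compile("(?=(" + "".join(parts) + "))")


def occurrence_starts(elements, sequence):
    """Return the start positions at which the motif occurs in `sequence`."""
    pattern = motif_to_regex(elements)
    return [match.start() for match in pattern.finditer(sequence)]


if __name__ == "__main__":
    sequence = "ATTCAGCAC"
    extensible = ["A", (1, 3), "C"]  # A, then 1 to 3 don't cares, then C
    rigid = ["A", (1, 1), "C"]       # A, exactly one don't care, then C
    print(occurrence_starts(extensible, sequence))
    print(occurrence_starts(rigid, sequence))
```

Each start position is counted once, however many gap lengths realize the motif there. The sketch only tests a single given pattern; it does not enumerate candidates, apply saturation conditions, or compute any statistical score.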

Alberto Apostolico, Matteo Comin, Laxmi Parida


Appropriate Saturation Conditions | Extensible Motifs | Prescribed Maximum Length | Software Engineering | TCBB 2010 |


Added	21 May 2011
Updated	21 May 2011
Type	Journal
Year	2010
Where	TCBB
Authors	Alberto Apostolico, Matteo Comin, Laxmi Parida
