Sciweavers

GECCO
2009
Springer

Creating regular expressions as mRNA motifs with GP to predict human exon splitting

Low correlation between mRNA concentrations measured at different locations for the same exon shows that many current Ensembl exon definitions are incomplete. Automatically created patterns (e.g. TCTTT) in genic DNA sequences identify potential new alternative transcripts. Strongly typed grammar based genetic programming (GP) is used to evolve regular expressions (RE) that separate gene exons with potential alternative mRNA expression from those without. RNAnet gives us correlations between Affymetrix HG-U133 Plus 2 GeneChip probe measurements for the same exon across 2757 Homo sapiens tissue samples from NCBI's GEO database. We identify many non-atomic Ensembl exons, i.e. exons with substructure. Biological patterns can be data mined using a Backus-Naur form (BNF) context-free grammar and a strongly typed GP that is written in gawk and uses egrep. The automatically produced DNA motifs suggest that alternative polyadenylation is not responsible. (Short version in [19].) The training data is avai...
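As a rough illustration of how a regular expression can act as an exon classifier, the sketch below scores a few candidate motifs by how well they separate two sets of sequences. It is not the authors' gawk and egrep system: it uses Python's `re` module as a stand-in, the function names `matches` and `accuracy` are hypothetical, and the sequences and all candidates other than TCTTT are invented for the example. In a GP run, a score like this could serve as the fitness of each evolved expression.

```python
import re


def matches(pattern: str, sequence: str) -> bool:
    """Return True if the pattern occurs anywhere in the sequence (egrep-style search)."""
    return re.search(pattern, sequence) is not None


def accuracy(pattern: str, with_substructure: list, without_substructure: list) -> float:
    """Fraction of sequences the pattern classifies correctly.

    A match predicts "exon with substructure"; no match predicts "atomic exon".
    """
    hits = sum(matches(pattern, s) for s in with_substructure)
    correct_rejections = sum(not matches(pattern, s) for s in without_substructure)
    total = len(with_substructure) + len(without_substructure)
    return (hits + correct_rejections) / total


# Toy data, invented for illustration only (not taken from the paper).
positives = ["GGATCTTTACG", "TCTTTGGCA", "ACGTACGT"]
negatives = ["GGGCCCAAA", "ACACACAC", "TTGTTAGA"]

# Candidate motifs; TCTTT is the example from the abstract, the rest are made up.
candidates = ["TCTTT", "ACG(T|G)", "C+A"]

# Keep the candidate with the highest classification accuracy.
best = max(candidates, key=lambda p: accuracy(p, positives, negatives))
```

Plain accuracy is used here only because it is easy to read; the abstract does not say which fitness measure the GP system used.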
William B. Langdon, Joanna Rowsell, Andrew P. Harrison
Added 26 May 2010
Updated 26 May 2010
Type Conference
Year 2009
Where GECCO
Authors William B. Langdon, Joanna Rowsell, Andrew P. Harrison