Data Mining and Knowledge Discovery techniques have proved to be efficient tools for a variety of complex tasks in biology, including DNA research. This paper presents an implementation of these techniques for finding regularities in tables of context features of DNA sequences involved in transcription regulation. The goal is to discover regularities that relate nucleotide sequences to their functional class. The search for regularities is implemented in the software system "Gene Discovery", which is based on first-order probabilistic logic. The "Gene Discovery" system provides a general scenario for the functional annotation of an arbitrary nucleotide sequence. The system accepts molecular genetic data retrieved from a database using SQL queries. Sequences of non-homologous gene promoters extracted from the TRRD database were analysed using this system, and several regularities were detected. These regularities relate the context of regulatory DNA...
Eugenii E. Vityaev, Yuri L. Orlov, Oleg V. Vishnev