Background: We present an approach for identifying gene regulation patterns from sequence and expression data collected for Saccharomyces cerevisiae. Our main goal is to relate the combinations of transcription factor binding sites (also referred to as binding site modules) identified in gene promoters to the expression of these genes. The novel aspects include local expression similarity clustering and an exact IF-THEN rule inference algorithm. We also provide a method of rule generalization to include genes with unknown expression profiles. Results: We have implemented the proposed framework and tested it on publicly available datasets from the yeast S. cerevisiae. The testing procedure consists of thorough statistical analyses comparing the groups of genes that match the rules we infer from expression data against known sets of coregulated genes. For this purpose we have used published ChIP-Chip data and Gene Ontology annotations. In order to make these tests more objective we compare our r...
Bartek Wilczynski, Torgeir R. Hvidsten, Andriy Kry