Genome-wide microarray designs containing millions to hundreds of millions of probes are available for a variety of mammals, including mouse and human. These genome tiling arrays can potentially lead to significant advances in science and medicine, e.g., by indicating new genes and alternative primary and secondary transcripts. While bottom-up pattern-matching techniques (e.g., hierarchical clustering) can be used to find gene structures in microarray data, we believe that the many interacting hidden variables and complex noise patterns lend themselves more naturally to an analysis based on generative models. We describe a generative model of tiling data and show how the sum-product algorithm can be used to infer hybridization noise, probe sensitivity, new transcripts, and alternative transcripts. The method, called GenRate, maximizes a global scoring function that enables multiple transcripts to compete for ownership of putative probes. We apply GenRate to a new exon tiling dataset from mouse chromo...
Brendan J. Frey, Quaid Morris, Timothy R. Hughes