We use a quantitative definition of specificity to develop a neural network for the identification of common protein binding sites in a collection of unaligned DNA fragments. We demonstrate the equivalence of the method to maximizing Information Content of the aligned sites when simple models of the binding energy and the genome are employed. The network method subsumes those simple models and is capable of working with more complicated ones. This is demonstrated using a Markov model of the E. coli genome and a sampling method to approximate the partition function. A variation of Gibbs' sampling aids in avoiding local minima.
John M. Heumann, Alan S. Lapedes, Gary D. Stormo
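
As an illustration of the Information Content measure mentioned in the abstract, the following is a minimal sketch that scores a set of already aligned sites against a background base distribution. It is not the network method itself. The function name `information_content`, the uniform default background (each base at 0.25), and the convention that unobserved bases contribute zero are assumptions made for this example.

```python
import math
from collections import Counter


def information_content(sites, background=None):
    """Information content (in bits) of equal-length aligned sites.

    Sums, over alignment positions i and bases b, f_ib * log2(f_ib / p_b),
    where f_ib is the observed frequency of base b at position i and p_b
    is the background frequency. Unobserved bases contribute zero.
    """
    if not sites:
        raise ValueError("at least one site is required")
    if background is None:
        # Assumption for this sketch: uniform background over the four bases.
        background = {base: 0.25 for base in "ACGT"}

    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")

    total = 0.0
    for i in range(length):
        # Count the bases observed in column i of the alignment.
        counts = Counter(site[i] for site in sites)
        for base, n in counts.items():
            f = n / len(sites)
            total += f * math.log2(f / background[base])
    return total
```

Under the uniform background, a fully conserved position contributes 2 bits and a position where all four bases are equally frequent contributes 0 bits, so `information_content(["ACGT", "ACGT"])` evaluates to 8.0. A non-uniform genome model can be supplied through the `background` argument.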