We consider the problem of parsing a sequence into different classes of subsequences. Two common examples are finding the exons and introns in genomic sequences and identifying the secondary structure domains of protein sequences. In each case there are various types of evidence that are relevant to the classification, but none are completely reliable, so we expect some weighted average of all the evidence to provide improved classifications. For example, in the problem of identifying coding regions in genomic DNA, the combined use of evidence such as codon bias and splice junction patterns can give more reliable predictions than either type of evidence alone. We show three main results:
Gary D. Stormo, David Haussler