The problem of multiple global comparison in families of biological sequences has been well studied. Fewer algorithms have been developed for identifying local consensus patterns or motifs in biological sequence. These two important problems have different biological constraints and, consequently, different computational approaches. The difficulty of finding the biologically meaningful motifs results from (1) the variation among motif bases, (2) the alignment of motif position sites among the sequences, and (3) the multiplicity of motif occurrences within a given sequence. In this paper, we review and compare the main approaches for finding motifs. We also introduce our own approach, DMS, which combines two objective functions with an improved iterative sampling search method. We demonstrate the effectiveness of the various algorithms by comparing them on 10 real domains and 14 artificial domains. The main advantage of DMS is that it is better able to find shorter motifs.
Yuh-Jyh Hu, Suzanne B. Sandmeyer, Dennis F. Kibler