This paper introduces a method for identifying empirically conserved amino acid substitution groups. In contrast with existing approaches that view amino acid substitution as a pairwise phenomenon, the method presented here identifies conserved groups of amino acids using a data structure called a conditional distribution matrix. The conditional distribution matrix extends the concept of a pairwise substitution matrix by changing the context of substitution from a single amino acid to a group of amino acids. The matrix tabulates information from a database of protein families that contains numerous aligned positions. Each row in the matrix contains the distribution of amino acids in those aligned positions that contain a given conditioning group of amino acids. The method converts a database of protein families into a conditional distribution matrix and then examines each possible substitution group for evidence of conservation. The algorithm is applied to the BLOCKS and HSSP databases. Twenty amino acid substitution groups are found to be co...
Thomas D. Wu, Douglas L. Brutlag
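
The row construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that an aligned position "contains" a conditioning group when every amino acid in the group occurs at least once in that column, and it caps the group size for brevity. The function names, the `max_group_size` parameter, and the toy columns are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def conditional_distribution_matrix(columns, max_group_size=2):
    """Map each conditioning group to pooled amino acid counts.

    Assumption (not from the source): a column contains a group when
    every residue of the group appears in that column at least once.
    """
    matrix = {}
    for column in columns:
        counts = Counter(column)      # residue counts in this aligned position
        residues = sorted(counts)     # distinct residues present in the column
        for size in range(1, max_group_size + 1):
            for group in combinations(residues, size):
                # Pool this column's counts into the row for the group.
                row = matrix.setdefault(frozenset(group), Counter())
                row.update(counts)
    return matrix


def row_distribution(matrix, group):
    """Normalize one row to relative frequencies (empty dict if unseen)."""
    row = matrix.get(frozenset(group), Counter())
    total = sum(row.values())
    return {aa: n / total for aa, n in row.items()} if total else {}


# Hypothetical toy data: each string is one aligned position (column).
columns = ["IIVL", "VVIV", "KKRR", "LIVM"]
cdm = conditional_distribution_matrix(columns)
iv_row = row_distribution(cdm, "IV")  # pooled over the three columns holding both I and V
```

A row whose distribution is concentrated on the members of its own conditioning group would be a candidate for a conserved substitution group. The criterion the paper actually uses to judge conservation is not given in this excerpt, so it is left out of the sketch.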