This work focuses on the inference of evolutionary relationships in protein superfamilies, and the uses of these relationships to identify key positions in the structure, to infer attributes on the basis of evolutionary distance, and to identify potential errors in sequence annotations. Relative entropy, a distance metric from information theory, is used in combination with Dirichlet mixture priors to estimate a phylogenetic tree for a set of proteins. This method infers key structural or functional positions in the molecule, and guides the tree topology to preserve these important positions within subtrees. Minimum-description-length principles are used to determine a cut of the tree into subtrees, to identify the subfamilies in the data. This method is demonstrated on SH2-domain-containing proteins, resulting in a new subfamily assignment for Src2_drome and a suggested evolutionary relationship between Nck_human and Drk_drome, Sem5_caeel, Grb2_human and Grb2_chick.
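To make the central quantity concrete, the following is a minimal sketch of relative entropy between the amino-acid distributions of two alignment columns. It is an illustration under stated assumptions, not the method described above: a single symmetric Dirichlet prior (a uniform pseudocount) stands in for the Dirichlet mixture priors, and the names `column_distribution`, `relative_entropy`, and the `pseudocount` parameter are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_distribution(column, pseudocount=1.0):
    """Estimate amino-acid probabilities for one alignment column.

    Assumption: a single symmetric Dirichlet prior (uniform pseudocount)
    is used here as a simplified stand-in for a Dirichlet mixture prior.
    Characters outside the 20 standard amino acids (e.g. gaps) are ignored.
    """
    counts = Counter(column)
    total = sum(counts[a] for a in AMINO_ACIDS) + pseudocount * len(AMINO_ACIDS)
    return {a: (counts[a] + pseudocount) / total for a in AMINO_ACIDS}


def relative_entropy(p, q):
    """Relative entropy D(p || q) in bits.

    The pseudocounts keep every probability strictly positive, so the
    logarithm is always defined. Note that D(p || q) != D(q || p) in general.
    """
    return sum(p[a] * math.log2(p[a] / q[a]) for a in AMINO_ACIDS)


# Hypothetical columns taken from two candidate subfamilies.
conserved = column_distribution("WWWW")
divergent = column_distribution("LLLV")
print(relative_entropy(conserved, divergent))
```

A large value for a column indicates that the two groups prefer different residues at that position, which is the kind of signal a subfamily-defining position would produce. The value is zero when the two distributions are identical.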