We consider the problem of automatic discovery of patterns and the corresponding subfamilies in a set of biosequences. The sequences are unaligned and may contain noise of unknown level. The patterns are of the type used in the PROSITE database. In our approach, patterns and their respective subfamilies are discovered simultaneously. We develop a theoretically substantiated significance measure for a set of such patterns and an algorithm that approximates the best pattern set and the subfamilies. The approach is based on the minimum description length (MDL) principle. We report a computational experiment that correctly finds subfamilies in the family of chromodomains and reveals new strong patterns.
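
To make the notion of PROSITE-type patterns and pattern-defined subfamilies concrete, the following minimal sketch translates a pattern written in PROSITE syntax into a regular expression and groups unaligned sequences by the first pattern they match. It is an illustration only: the function names, the two patterns, the subfamily labels, and the toy sequences are all hypothetical, and the sketch does not reproduce the MDL-based significance measure or the search algorithm described here.

```python
import re


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regular expression.

    Handles 'x' (any residue), [ABC] (allowed set), {ABC} (forbidden set),
    repetition counts such as x(3) or x(2,4), and the '<' / '>' anchors.
    """
    parts = []
    for element in pattern.rstrip(".").split("-"):
        prefix = "^" if element.startswith("<") else ""
        suffix = "$" if element.endswith(">") else ""
        element = element.strip("<>")
        # Split the element into its core and an optional (n) or (n,m) count.
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, lo, hi = m.group(1), m.group(2), m.group(3)
        if core == "x":
            core = "."
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"
        # Bracketed sets and single letters are already valid regex syntax.
        if lo and hi:
            core += "{%s,%s}" % (lo, hi)
        elif lo:
            core += "{%s}" % lo
        parts.append(prefix + core + suffix)
    return "".join(parts)


def assign_subfamilies(sequences, patterns):
    """Group sequence ids by the first pattern that matches anywhere.

    Sequences matching no pattern are collected under 'unassigned',
    a crude stand-in for noise (an assumption of this sketch).
    """
    compiled = {name: re.compile(prosite_to_regex(p))
                for name, p in patterns.items()}
    groups = {name: [] for name in patterns}
    groups["unassigned"] = []
    for seq_id, seq in sequences.items():
        for name, rx in compiled.items():
            if rx.search(seq):
                groups[name].append(seq_id)
                break
        else:
            groups["unassigned"].append(seq_id)
    return groups


# Hypothetical patterns and toy sequences, invented for illustration.
patterns = {
    "subfamily_A": "C-x(2)-C-x(3)-[LIVM]-x-H",
    "subfamily_B": "W-[ST]-{G}-E-D-x(2,3)-W",
}
sequences = {
    "seq1": "AACKKCDEFLGHAA",
    "seq2": "MMCRRCAAAVTHQQ",
    "seq3": "GGWSPEDKKWAA",
    "seq4": "TTWTAEDRRRWLL",
    "seq5": "AAAAAAGGGGG",
}
groups = assign_subfamilies(sequences, patterns)
```

Here the patterns are given in advance, whereas the approach above discovers the patterns and the subfamilies together; the sketch only shows what a pattern set and its induced partition look like.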