- Protein sequence motifs can potentially determine the activity and function of proteins. Therefore, obtaining protein sequence motifs that are universally conserved and cross protein family boundaries is crucial. In this study, fuzzy C-means and an improved K-means clustering algorithm are applied to granulate the entire dataset and analyze each granule separately. In addition, a modified bi-clustering algorithm is employed to improve cluster quality. This is the first time a bi-clustering algorithm has been applied for cluster extraction purposes. Compared with the traditional shrink method, the modified bi-clustering algorithm generates more clusters with secondary structure similarity greater than 60% at the same data filtering percentage. Moreover, the bi-clustering algorithm is shown to be able to select meaningful amino acids that are of interest to biologists.
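
The granulation step can be illustrated with a minimal sketch of the standard fuzzy C-means membership formula followed by a threshold-based assignment of points to overlapping granules. This is not the implementation used in the study: the function names `fcm_memberships` and `granulate`, the fuzzifier `m = 2.0`, the membership threshold of `0.2`, the Euclidean distance, and the assumption that cluster centers are already available are all illustrative choices made here.

```python
import math


def fcm_memberships(points, centers, m=2.0):
    """Return the fuzzy C-means membership of each point in each cluster.

    Uses the standard formula u_ij = 1 / sum_k (d_ij / d_kj)^(2 / (m - 1)),
    with Euclidean distance and fuzzifier m > 1 (both assumed here).
    """
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if any(d == 0.0 for d in dists):
            # The point coincides with one or more centers: split full
            # membership evenly among those centers to avoid dividing by zero.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(hits)
            memberships.append([h / total for h in hits])
            continue
        row = [1.0 / sum((d_i / d_k) ** exponent for d_k in dists)
               for d_i in dists]
        memberships.append(row)
    return memberships


def granulate(points, centers, threshold=0.2, m=2.0):
    """Place each point index in every granule where its membership
    reaches the (assumed) threshold, so granules may overlap."""
    granules = [[] for _ in centers]
    for idx, row in enumerate(fcm_memberships(points, centers, m)):
        for g, u in enumerate(row):
            if u >= threshold:
                granules[g].append(idx)
    return granules


# Toy 2-D data standing in for encoded sequence segments (hypothetical).
points = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0), (5.0, 0.0), (2.5, 0.0)]
centers = [(0.5, 0.0), (4.5, 0.0)]
print(granulate(points, centers))  # → [[0, 1, 4], [2, 3, 4]]
```

In this toy example the point midway between the two centers falls into both granules, which is the property that lets each granule then be clustered separately (for instance with K-means) without losing borderline data.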