Background: Identification of the structural domains of proteins is important for understanding the organizational principles and mechanisms of protein folding, and for insights into protein function and evolution. Algorithmic methods developed so far for dissecting proteins of known structure into domains are based on an examination of multiple geometrical, physical and topological features. Successful as many of these approaches are, they rely on many heuristics, and it is not clear whether they illuminate any deep underlying principles of protein domain organization. Other well-performing domain dissection methods rely on comparative sequence analysis. These methods are applicable to sequences with known and unknown structure alike, and their success highlights a fundamental principle of protein modularity, but they do not directly improve our understanding of protein spatial structure. Results: We present a novel graph-theoretical algorithm for the identification of doma...
Frank Emmert-Streib, Arcady R. Mushegian