A large portion of the software used in industry today is legacy software. Legacy systems often evolve into difficult-to-maintain systems whose original design has been lost or no longer closely matches the actual structure of the system. In our paper we present a "hybrid" process in which we combine extracted code facts and information derived from interviewing developers to determine the architectural structure of a legacy system. We introduce the steps of this process using a case study of a large legacy system, an optimizing back end for IBM compilers. These steps include collecting "back of the envelope" designs from project personnel, extracting raw facts from the source code, collecting naming conventions for files, clustering code artifacts based on naming conventions, creating tentative structural diagrams, and collecting more "live" information in the form of reactions to these tentative diagrams, and so on, until we converge on an architectural structure. Our c...
Vassilios Tzerpos, Richard C. Holt