An important challenge in software reengineering is to encapsulate collections of related data that, due to the absence of appropriate constructs for encapsulation in legacy programming languages, may be distributed throughout the code. The encapsulation of such collections is a necessary step for reengineering a legacy system into an object-oriented design or implementation. Encapsulating a set of related symbolic constants into an enumeration type is an instance of this problem. We present a classification of how enumeration types are modeled using symbolic constants in real-world programs, a set of heuristics to identify candidate enumeration types, and an experimental evaluation of these heuristics.
John M. Gravley, Arun Lakhotia