ICSE
1998
IEEE-ACM

Extracting Concepts from File Names: A New File Clustering Criterion

Decomposing complex software systems into conceptually independent subsystems is a significant software engineering activity that has received considerable research attention. Most research in this domain considers the body of the source code, trying to cluster together files that are conceptually related. This paper discusses techniques for extracting concepts (we call them "abbreviations") from a more informal source of information: file names. The task is difficult because nothing indicates where to split the file names into substrings. In general, finding abbreviations would require domain knowledge to identify the concepts that are referred to in a name and intuition to recognize such concepts in abbreviated forms. We show by experiment that the techniques we propose allow about 90% of the abbreviations to be found automatically.
KEYWORDS: Reverse Engineering, Design Recovery, Artificial Intelligence, Program-Understanding
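To make the splitting problem concrete, here is a minimal Python sketch of one naive way to propose candidate abbreviations. It is not the authors' technique, which the abstract does not spell out. It assumes that a substring shared by several file names is more likely to name a concept than one that appears only once. The function names, the length bounds, and the sample file names are all hypothetical.

```python
from collections import Counter


def candidate_substrings(name, min_len=2, max_len=6):
    """Return every distinct substring of `name` within the length bounds.

    The bounds are illustrative assumptions, not values from the paper.
    """
    return {
        name[i:i + n]
        for n in range(min_len, max_len + 1)
        for i in range(len(name) - n + 1)
    }


def shared_candidates(file_names, min_files=2):
    """Count, for each substring, how many file names contain it.

    Only substrings found in at least `min_files` names are kept as
    candidate abbreviations.
    """
    counts = Counter()
    for name in file_names:
        # Drop the extension and normalise case before splitting.
        stem = name.rsplit(".", 1)[0].lower()
        # A set per file ensures each name contributes at most once.
        counts.update(candidate_substrings(stem))
    return {s: c for s, c in counts.items() if c >= min_files}


# Hypothetical file names, for illustration only.
candidates = shared_candidates(["listmgr.c", "listdbg.c", "qmgr.c"])
print(sorted(candidates))
```

On these sample names the shared substrings include plausible concepts such as `list` and `mgr`, but also fragments such as `is` and `st`. That noise shows why the abstract says domain knowledge and intuition would ordinarily be needed to pick out the real abbreviations.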
Nicolas Anquetil, Timothy Lethbridge
Added 05 Aug 2010
Updated 05 Aug 2010
Type Conference
Year 1998
Where ICSE
Authors Nicolas Anquetil, Timothy Lethbridge