Software document repositories store artifacts produced in the course of developing software products. Most such repositories, however, are simply archives of documents. It is not unusual to find projects whose software artifacts are scattered across unrelated repositories, stored at varying levels of granularity, and lacking a centralized management system. This makes the information in existing repositories difficult to reuse. This paper presents a methodology for constructing an ontology-based repository of reusable knowledge. The information in the repository is extracted from specification documents using text mining. Ontologies guide the extraction process and organize the extracted information. The methodology is being used to develop a repository of recurring and crosscutting aspects in software specification documents.
Yan Wu, Harvey P. Siy, Mansour Zand, Victor L. Win