We have been developing MADAM ID (Mining Audit Data for Automated Models for Intrusion Detection), a data mining (i.e., knowledge discovery) framework [LSM98, LSM99b, LSM99a]. The 1998 DARPA Intrusion Detection Evaluation showed that the models produced by MADAM ID performed comparably to the best purely knowledge-engineered systems. Although our data mining techniques have shown great potential, it is important to recognize the critical roles that domain knowledge, and thus knowledge engineering, play in the process of building intrusion detection (ID) models. In this paper, we examine why domain knowledge is required in the data mining process, and suggest how to combine knowledge discovery and knowledge engineering to build intrusion detection systems (IDSs). We first briefly review the main ideas behind MADAM ID. The main components of MADAM ID are classification and meta-classification [CS93] programs, association rules [AIS93] and frequent episodes [MTV95] programs, a feature construction system, and a conversion syst...
Wenke Lee, Salvatore J. Stolfo