In data mining, we emphasize the need for learning from huge, incomplete and imperfect data sets (Fayyad et al., 1996; Frawley et al., 1991; Piatetsky-Shapiro and Frawley, 1991). To handle noise in the problem domain, existing learning systems avoid overfitting the imperfect training examples by excluding insignificant patterns. However, these systems use a restrictive attribute-value language for representing the training examples and the induced knowledge. Moreover, some important patterns are ignored because they are statistically insignificant. In this paper, we present a framework that combines Genetic Programming (Koza, 1992; 1994) and Inductive Logic Programming (Muggleton, 1992) to induce knowledge, represented in various knowledge representation formalisms, from noisy databases. The framework is based on a formalism of logic grammars, which allows the search space to be specified declaratively. An implementation of the framework, LOGENPRO (The Logic grammar based GENetic PROgramming...
Man Leung Wong, Kwong-Sak Leung, Jack C. Y. Cheng