Semantic Query Optimisation uses semantic knowledge about a database, expressed as rules, to transform queries. Rules are normally learned from past queries issued by users. Over time, however, the rule set can grow very large, degrading the efficiency of the system as a whole; this is known as the utility problem. This paper proposes a solution to the utility problem that uses statistical techniques to select and maintain an optimal rule set. Statistical methods have been widely used in the field of Knowledge Discovery to identify and measure relationships between attributes. Here we extend this approach to Semantic Query Optimisation using the Chi-square statistical method, which is integrated into a prototype query optimiser developed by the authors. We also present a new technique for calculating Chi-square which is faster and more efficient than the traditional method in this setting.
Barry G. T. Lowden, Jerome Robinson
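
As a point of reference for the Chi-square test mentioned in the abstract, the following is a minimal sketch of the standard Pearson computation over a contingency table of observed counts. It is not the faster technique the authors present, which the abstract does not describe. The function name `chi_square` and the example counts are assumptions made purely for illustration.

```python
def chi_square(table):
    """Pearson chi-square statistic for an r x c contingency table of counts.

    Hypothetical helper for illustration; this is the textbook computation,
    not the authors' optimised method.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two attributes were independent.
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    return stat


# Made-up counts: rows are values of one attribute, columns values of another.
associated = [[30, 10], [10, 30]]
independent = [[20, 20], [20, 20]]

print(chi_square(associated))   # large value: the attributes appear related
print(chi_square(independent))  # zero: observed counts match independence
```

A large statistic, relative to the critical value for the table's degrees of freedom, suggests that the two attributes are related. This is the kind of evidence that could be used to decide whether a rule linking them is worth keeping in the rule set.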