Summarization of query results is an important problem for many OLAP applications. The Minimum Description Length principle has been applied in various studies to provide summaries. In this paper, we consider a new approach of applying the MDL principle. We study the problem of finding summaries of the form S H for k-d cubes with tree hierarchies. The S part generalizes the query results, while the H part describes all the exceptions to the generalizations. The optimization problem is to minimize the combined cardinalities of S and H. We first characterize the problem by showing that solving the 1-d problem can be done in time linear to the size of hierarchy, but solving the 2-d problem is NPhard. We then develop three different heuristics, based on a greedy approach, a dynamic programming approach and a quadratic programming approach. We conduct a comprehensive experimental evaluation. Both the dynamic programming algorithm and the greedy algorithm can be used for different circu...
Shaofeng Bu, Laks V. S. Lakshmanan, Raymond T. Ng