To support fast, interactive multidimensional data analysis, database systems precompute aggregate views on some subsets of dimensions and their corresponding hierarchies. However, the problem of what to precompute is difficult and intriguing. The leading existing algorithm, BPUS, has a running time that is polynomial in the number of views and is guaranteed to be within (0.63 - f) of optimal, where f is the fraction of available space consumed by the largest aggregate. Unfortunately, BPUS can be impractically slow, and in some instances it may miss good solutions because of the coarse granularity at which it decides what to precompute. In view of this, we study the structure of the precomputation problem and show that, under certain broad conditions on the multidimensional data, an even simpler and faster algorithm, PBS, achieves the same (0.63 - f) bound. Our empirical study of the behavior of PBS shows that even when these conditions do not hold, PBS picks ...
Amit Shukla, Prasad Deshpande, Jeffrey F. Naughton
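
The abstract does not spell out how PBS selects aggregates. As a rough illustration only, the sketch below assumes a size-ordered rule: take aggregates in increasing order of size until the next one no longer fits in the available space. The function names, the example view names and sizes, and the reading of f as the largest *selected* aggregate divided by the space budget are all hypothetical choices made for this example, not details taken from the paper.

```python
def pick_by_size(view_sizes, space_budget):
    """Assumed size-ordered rule: take views smallest-first and stop at the
    first view that would exceed the space budget.

    view_sizes maps a (hypothetical) view name to its size in rows.
    Returns the list of chosen view names and the total space used.
    """
    chosen, used = [], 0
    # Sort by size, breaking ties by name so the result is deterministic.
    for name, size in sorted(view_sizes.items(), key=lambda kv: (kv[1], kv[0])):
        if used + size > space_budget:
            break
        chosen.append(name)
        used += size
    return chosen, used


def bound_from_abstract(view_sizes, chosen, space_budget):
    """Evaluate the (0.63 - f) expression quoted in the abstract.

    Assumption: f is the size of the largest chosen view divided by the
    space budget. With no chosen views, f is taken to be 0.
    """
    if not chosen:
        return 0.63
    f = max(view_sizes[name] for name in chosen) / space_budget
    return 0.63 - f


# Hypothetical aggregate sizes (in rows) for a three-dimensional cube.
sizes = {"A": 10, "B": 20, "C": 50, "ABC": 100}
chosen, used = pick_by_size(sizes, space_budget=90)
bound = bound_from_abstract(sizes, chosen, space_budget=90)
print(chosen, used, round(bound, 4))
```

With these made-up sizes the three single-dimension aggregates fit and the full cube does not. The example also shows why the guarantee weakens as f grows: a single aggregate that takes up most of the budget leaves the (0.63 - f) expression close to zero.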