Consider a scientist who wants to explore multiple data sets and select the relevant ones for further analysis. Since the visualization real estate may put a stringent constraint on how much detail can be presented to this user on a single page, effective table summarization techniques are needed to create summaries that are both sufficiently small and effective in communicating the available content. In this paper, we first argue that table summarization can benefit from knowledge about the acceptable alternatives for clustering the values in the database. We formulate the problem of table summarization with the help of value lattices. We then provide a framework to express alternative clustering strategies and to account for various utility measures (such as information loss) in assessing different summarization alternatives. Based on this interpretation, we introduce three preference criteria, max-min-util (cautious), max-sum-util (cumulative), and pareto-util, for the ...
K. Selçuk Candan, Huiping Cao, Yan Qi,
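
The three preference criteria named in the abstract can be illustrated with a minimal sketch. The abstract is cut off before it defines them, so the readings below are assumptions inferred from the names alone: max-min-util picks the alternative whose worst utility is highest, max-sum-util picks the one whose total utility is highest, and pareto-util keeps every alternative that no other alternative dominates. The candidate names, the utility vectors, and all function names are hypothetical and do not come from the paper.

```python
from typing import Dict, List, Sequence

# Hypothetical input: each candidate summary maps to a vector of utility
# scores, one per utility measure (higher is better). These numbers are
# invented for illustration only.
Candidates = Dict[str, Sequence[float]]


def max_min_util(cands: Candidates) -> str:
    """Cautious choice (assumed reading): maximize the worst-case utility."""
    return max(cands, key=lambda name: min(cands[name]))


def max_sum_util(cands: Candidates) -> str:
    """Cumulative choice (assumed reading): maximize the summed utility."""
    return max(cands, key=lambda name: sum(cands[name]))


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_util(cands: Candidates) -> List[str]:
    """Pareto choice (assumed reading): keep all non-dominated candidates."""
    return [
        name
        for name, util in cands.items()
        if not any(
            dominates(other_util, util)
            for other, other_util in cands.items()
            if other != name
        )
    ]


candidates: Candidates = {
    "S1": (0.9, 0.3),
    "S2": (0.6, 0.5),
    "S3": (0.5, 0.4),
}

cautious = max_min_util(candidates)    # "S2": its worst score, 0.5, is the highest
cumulative = max_sum_util(candidates)  # "S1": its total, 1.2, is the highest
frontier = pareto_util(candidates)     # ["S1", "S2"]: S3 is dominated by S2
```

In this toy example the cautious and cumulative criteria disagree, while the Pareto criterion returns both of their choices and discards only the dominated candidate; it therefore yields a set of alternatives rather than a single winner.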