XCluster Synopses for Structured XML Content

We tackle the difficult problem of summarizing the path/branching structure and value content of an XML database that comprises both numeric and textual values. We introduce a novel XML-summarization model, termed XCLUSTERs, that enables accurate selectivity estimates for the class of twig queries with numeric-range, substring, and textual IR predicates over the content of XML elements. In a nutshell, an XCLUSTER synopsis represents an effective clustering of XML elements based on both their structural and value-based characteristics. By leveraging techniques for summarizing XML-document structure as well as numeric and textual data distributions, our XCLUSTER model provides the first known unified framework for handling path/branching structure and different types of element values. We detail the XCLUSTER model, and develop a systematic framework for the construction of effective XCLUSTER summaries within a specified storage budget. Experimental results on synthetic and real-life dat...

Neoklis Polyzotis, Minos N. Garofalakis

Database | ICDE 2006 | Structured XML Content | Textual Data Distributions | XML Elements

Added	01 Nov 2009
Updated	01 Nov 2009
Type	Conference
Year	2006
Where	ICDE
Authors	Neoklis Polyzotis, Minos N. Garofalakis
