Fragment-based approximate retrieval in highly heterogeneous XML collections



Due to the heterogeneous nature of XML data for internet applications, exact matching of queries is often inadequate. The need arises to quickly identify subtrees of XML documents in a collection that are similar to a given pattern. Similarity involves both tags, which are not required to coincide, and structure, in which not all the relationships among nodes in the tree structure are strictly preserved. In this paper we present an efficient approach to the identification of similar subtrees, relying on ad-hoc indexing structures. The approach makes it possible to quickly detect, in a heterogeneous document collection, the minimal portions that exhibit some similarity with the pattern. These candidate portions are then ranked according to their actual similarity. The approach supports different notions of similarity, so it can be customized to different application domains. In the paper, three different similarity measures are proposed and compared. The approach is experimentally validated and t...
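
The abstract does not spell out the indexing structures or the three similarity measures, so the following is only an illustrative sketch of the general idea (index tags, collect candidate fragments, rank them), not the authors' algorithm. Everything in it is an assumption made for illustration: the `SYNONYMS` table standing in for "tags that are not required to coincide", the choice of each top-level child of a document root as a candidate fragment, and the tag-coverage score used for ranking.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical tag-equivalence table (assumption, not from the paper).
SYNONYMS = {"writer": "author", "name": "title"}


def canon(tag):
    """Map a tag to its canonical form so that similar tags can match."""
    return SYNONYMS.get(tag, tag)


def build_index(docs):
    """Inverted index: canonical tag -> set of (doc_id, fragment_position).

    Simplification (assumption): a fragment is a top-level child of the
    document root, identified by its position among the root's children.
    """
    index = defaultdict(set)
    for doc_id, xml_text in docs.items():
        root = ET.fromstring(xml_text)
        for pos, child in enumerate(root):
            for node in child.iter():  # the child itself and all descendants
                index[canon(node.tag)].add((doc_id, pos))
    return index


def rank(index, pattern_tags):
    """Rank fragments by the fraction of pattern tags they contain.

    Structure is ignored entirely here, which is a much cruder notion of
    similarity than the relaxed structural matching the abstract describes.
    """
    wanted = {canon(t) for t in pattern_tags}
    hits = defaultdict(set)
    for tag in wanted:
        for frag in index.get(tag, ()):
            hits[frag].add(tag)
    scored = [(len(tags) / len(wanted), frag) for frag, tags in hits.items()]
    # Highest score first; ties broken by fragment identifier for determinism.
    return sorted(scored, key=lambda s: (-s[0], s[1]))


docs = {
    "d1": "<lib><book><title>A</title><author>X</author></book>"
          "<cd><name>B</name></cd></lib>",
    "d2": "<shelf><item><writer>Y</writer><year>2008</year></item></shelf>",
}
results = rank(build_index(docs), ["title", "author"])
for score, (doc_id, pos) in results:
    print(f"{doc_id} fragment {pos}: {score:.2f}")
```

In this toy collection the `book` fragment of `d1` matches both pattern tags, while the `cd` and `item` fragments each match one tag only through the synonym table, so they rank lower.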

Ismael Sanz, Marco Mesiti, Giovanna Guerrini, Rafael Berlanga Llavori


Ad-hoc Indexing Structures | DKE 2008 | Heterogeneous Document Collection | Heterogeneous Nature


» Information Retrieval of Sequential Data in Heterogeneous XML Databases

» Relevance Feedback in XML Retrieval

Post Info

Added	10 Dec 2010
Updated	10 Dec 2010
Type	Journal
Year	2008
Where	DKE
Authors	Ismael Sanz, Marco Mesiti, Giovanna Guerrini, Rafael Berlanga Llavori
