Retrieval queries that combine structural constraints with keyword search pose a significant challenge for XML data management systems. Such queries are expected to be answered as efficiently and effectively as traditional keyword searches, while also satisfying the additional structural constraints. Several XML retrieval systems answer queries exhaustively by storing both structural indexes and a keyword index. Other systems answer top-k queries efficiently by constructing indexes in which keyword scores, for some structural elements, are stored in relevance order, enabling approaches such as the threshold algorithm (TA). In this paper we describe TReX, an XML retrieval system that can exploit multiple structural summaries (including newly defined ones). TReX can also self-manage small, redundant indexes to speed up the evaluation of workloads of top-k queries. The redundant indexes are maintained to enable TReX to select an evaluation strategy among three (and potentially more) retrie...
Mariano P. Consens, Xin Gu, Yaron Kanza, Flavio Ri