Towards better measures: evaluation of estimated resource description quality for distributed IR



An open problem for Distributed Information Retrieval (DIR) systems is how to represent large document repositories, also known as resources, both accurately and efficiently. Obtaining resource description estimates is an important phase in DIR, especially in non-cooperative environments. Measuring the quality of an estimated resource description is a contentious issue, as current measures do not provide an adequate indication of quality. In this paper, we provide an overview of the currently applied measures of resource description quality, before proposing the Kullback-Leibler (KL) divergence as an alternative. Through experimentation we illustrate the shortcomings of these past measures, whilst providing evidence that KL is a more appropriate measure of quality. When applying KL to compare different Query-Based Sampling (QBS) algorithms, our experiments provide strong evidence in favour of a previously unsupported hypothesis originally posited in the initial QBS work.
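As a rough illustration of the proposed measure, the sketch below computes the KL divergence between the term distribution of a full resource and the distribution estimated from a sample of it. The abstract does not give the exact formulation used in the paper, so several choices here are assumptions made for illustration only: the function names `term_distribution` and `kl_divergence`, the add-one (Laplace) smoothing, the shared vocabulary, the toy token lists, and the direction of the divergence (actual resource as the reference distribution).

```python
import math
from collections import Counter


def term_distribution(tokens, vocabulary, alpha=1.0):
    """Return smoothed term probabilities over a fixed vocabulary.

    Additive smoothing (alpha) is an assumption of this sketch; it keeps
    every probability non-zero so the KL divergence stays finite.
    """
    counts = Counter(tokens)
    total = sum(counts[t] for t in vocabulary) + alpha * len(vocabulary)
    return {t: (counts[t] + alpha) / total for t in vocabulary}


def kl_divergence(p, q):
    """KL(p || q) = sum over terms of p(t) * log(p(t) / q(t))."""
    return sum(p[t] * math.log(p[t] / q[t]) for t in p if p[t] > 0)


# Toy data (hypothetical): the full resource and a small sample drawn from it.
resource_tokens = "retrieval index query query sampling resource resource resource".split()
sample_tokens = "query resource resource".split()
vocabulary = sorted(set(resource_tokens) | set(sample_tokens))

actual = term_distribution(resource_tokens, vocabulary)
estimate = term_distribution(sample_tokens, vocabulary)

# Lower values mean the estimated description is closer to the actual one.
divergence = kl_divergence(actual, estimate)
print(f"KL(actual || estimate) = {divergence:.4f}")
```

Under these assumptions, a divergence of zero would indicate that the estimated description reproduces the resource's term distribution exactly, and larger values indicate a poorer estimate.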

Mark Baillie, Leif Azzopardi, Fabio Crestani


INFOSCALE 2006 | Resource Description | Resource Description Estimates | Resource Description Quality |


Post Info

Added	13 Jun 2010
Updated	13 Jun 2010
Type	Conference
Year	2006
Where	INFOSCALE
Authors	Mark Baillie, Leif Azzopardi, Fabio Crestani
