Towards better measures: evaluation of estimated resource description quality for distributed IR

Mark Baillie; Leif Azzopardi; Fabio Crestani

1st International ICST Conference on Scalable Information Systems

Research Article


@INPROCEEDINGS{10.1145/1146847.1146888,
    author={Mark Baillie and Leif Azzopardi and Fabio Crestani},
    title={Towards better measures: evaluation of estimated resource description quality for distributed IR},
    proceedings={1st International ICST Conference on Scalable Information Systems},
    publisher={ACM},
    proceedings_a={INFOSCALE},
    year={2006},
    month={6},
    keywords={},
    doi={10.1145/1146847.1146888}
}

Mark Baillie^1,2^,*, Leif Azzopardi^1,2^,*, Fabio Crestani^1,2^,*

1: Computer and Information Science Department, University of Strathclyde
2: Glasgow, United Kingdom

*Contact email: mb@cis.strath.ac.uk, leif@cis.strath.ac.uk, fabioc@cis.strath.ac.uk

Abstract

An open problem for Distributed Information Retrieval (DIR) systems is how to represent large document repositories, also known as resources, both accurately and efficiently. Obtaining resource description estimates is an important phase in DIR, especially in non-cooperative environments. Measuring the quality of an estimated resource description is a contentious issue, as current measures do not provide an adequate indication of quality. In this paper, we provide an overview of the currently applied measures of resource description quality, before proposing the Kullback-Leibler (KL) divergence as an alternative. Through experimentation we illustrate the shortcomings of these past measures, whilst providing evidence that KL is a more appropriate measure of quality. When applying KL to compare different Query-Based Sampling (QBS) algorithms, our experiments provide strong evidence in favour of a previously unsupported hypothesis originally posited in the initial QBS work.
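As a rough illustration of the idea in the abstract, the sketch below scores an estimated resource description against the actual resource by the KL divergence between their term distributions. It is not taken from the paper: the function names, the additive smoothing constant `alpha`, the toy token lists, and the direction KL(actual || estimate) are all assumptions made for this example.

```python
import math
from collections import Counter


def term_distribution(tokens, vocabulary, alpha=1.0):
    """Unigram term distribution over a fixed vocabulary.

    Additive smoothing (alpha, an assumption of this sketch) keeps every
    probability non-zero so the divergence below stays finite.
    """
    counts = Counter(tokens)
    total = sum(counts[t] for t in vocabulary) + alpha * len(vocabulary)
    return {t: (counts[t] + alpha) / total for t in vocabulary}


def kl_divergence(p, q):
    """KL(p || q) in nats for two distributions over the same vocabulary."""
    return sum(p[t] * math.log(p[t] / q[t]) for t in p if p[t] > 0)


# Hypothetical toy data: terms of the full resource and of two sampled estimates.
resource_tokens = ["ir", "ir", "query", "sample"]
close_sample = ["ir", "ir", "query"]
far_sample = ["sample", "sample", "sample"]

vocabulary = set(resource_tokens)
actual = term_distribution(resource_tokens, vocabulary)
kl_close = kl_divergence(actual, term_distribution(close_sample, vocabulary))
kl_far = kl_divergence(actual, term_distribution(far_sample, vocabulary))
```

Under these assumptions a lower value indicates an estimate whose term distribution is closer to that of the resource, so `kl_close` comes out smaller than `kl_far`.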

Published: 2006-06-01
Publisher: ACM

DOI: http://dx.doi.org/10.1145/1146847.1146888
