Document-based approach to improve the accuracy of pairwise comparison in evaluating information retrieval systems

Ravana, Sri Devi; TAHERI, MASUMEH SADAT; Rajagopal, Prabha

doi:10.1108/AJIM-12-2014-0171

Volume 67, Issue 4

20 July 2015

Editors

Dan Wu

Research Article| July 20 2015

Sri Devi Ravana

Department of Information System, University of Malaya, Kuala Lumpur, Malaysia

MASUMEH SADAT TAHERI

Department of Information System, University of Malaya, Kuala Lumpur, Malaysia

Prabha Rajagopal

Department of Information System, University of Malaya, Kuala Lumpur, Malaysia

Publisher: Emerald Publishing

Online ISSN: 2050-3814

Print ISSN: 2050-3806

Aslib Journal of Information Management (2015) 67 (4): 408–421.

https://doi.org/10.1108/AJIM-12-2014-0171

Purpose

– The purpose of this paper is to propose a method that compares the performance of paired information retrieval (IR) systems more accurately than the current method, which is based on the mean effectiveness scores of the systems across a set of identified topics/queries.
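
As a point of reference for the current, topic-based method, the following is a minimal sketch of how a mean effectiveness score such as mean average precision (MAP) is typically computed per system. The function names, the topic and document identifiers, and the two toy systems are hypothetical and are not taken from the paper.

```python
def average_precision(ranked_doc_ids, relevant_ids):
    """Average precision of one ranked list against a set of relevant ids."""
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_doc_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant rank
    return precision_sum / len(relevant_ids) if relevant_ids else 0.0


def mean_average_precision(run, qrels):
    """Mean of per-topic average precision.

    run:   topic id -> ranked list of document ids (one system's output)
    qrels: topic id -> set of relevant document ids
    """
    scores = [average_precision(run.get(topic, []), relevant)
              for topic, relevant in qrels.items()]
    return sum(scores) / len(scores)


# Hypothetical relevance judgments and system outputs, for illustration only.
qrels = {"t1": {"d1", "d3"}, "t2": {"d2"}}
system_a = {"t1": ["d1", "d2", "d3"], "t2": ["d2", "d1"]}
system_b = {"t1": ["d2", "d1", "d3"], "t2": ["d1", "d2"]}

map_a = mean_average_precision(system_a, qrels)
map_b = mean_average_precision(system_b, qrels)
```

In a topic-based comparison, the per-topic average precision values of the two systems form the paired samples, and the MAP values summarize each system.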

Design/methodology/approach

– In the proposed approach, document-level scores, rather than the classic set of topic scores, are used as the evaluation unit. Each document score is a defined document weight, and these weights take the place of the systems' mean average precision (MAP) scores as the test statistic in the significance tests. The experiments were conducted using the TREC 9 Web track collection.

Findings

– The p-values generated by two significance tests, namely Student's t-test and the Mann-Whitney test, show that using document-level scores as the evaluation unit makes the difference between IR systems more significant than using topic scores.
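
The two tests named above can be sketched with the Python standard library alone. This is an illustration under stated assumptions, not the authors' implementation: the paired form of the t-test is assumed, only the t statistic is computed (the standard library has no t distribution), the Mann-Whitney p-value uses a normal approximation without tie correction, and the score lists are made-up placeholders. The same functions would apply whether each list holds topic scores or document-level scores.

```python
import math
from statistics import NormalDist, mean, stdev


def paired_t_statistic(scores_a, scores_b):
    """t statistic of the paired differences (assumes equal-length samples)."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))


def mann_whitney_u(scores_a, scores_b):
    """U statistic for sample A and a two-sided p-value.

    The p-value is a normal approximation with no tie correction,
    so it is only a rough guide for small or heavily tied samples.
    """
    pooled = sorted([(s, 0) for s in scores_a] + [(s, 1) for s in scores_b])
    rank_sum_a = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        rank_sum_a += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n_a, n_b = len(scores_a), len(scores_b)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    mu = n_a * n_b / 2
    sigma = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u_a - mu) / sigma
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return u_a, p_value


# Placeholder scores for two systems; not data from the paper.
scores_a = [0.9, 0.8, 0.7]
scores_b = [0.1, 0.2, 0.3]

t_stat = paired_t_statistic(scores_a, scores_b)
u_stat, p_value = mann_whitney_u(scores_a, scores_b)
```

For exact p-values, a statistics package such as SciPy (`scipy.stats.ttest_rel`, `scipy.stats.mannwhitneyu`) would normally be used in place of these hand-rolled functions.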

Originality/value

– A suitable test collection is a primary prerequisite for the comparative evaluation of IR systems. However, in addition to reusable test collections, accurate statistical testing is necessary for these evaluations. The findings of this study will help IR researchers evaluate their retrieval systems and algorithms more accurately.
