Authors
Emine Yilmaz, Gabriella Kazai, Nick Craswell
Publication date
2012
Description
In information retrieval, relevance judgments play an important role: they are required both for evaluating the quality of retrieval systems and for training learning-to-rank algorithms. In recent years, researchers in industry have published numerous papers using judgments obtained from a commercial search engine. As typically no information is provided about the quality of these judgments, their reliability for evaluating or training retrieval systems remains questionable. In this paper, we analyze the reliability of such judgments for evaluating the quality of retrieval systems by comparing them with judgments made by NIST assessors at TREC.
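The sketch below is not taken from the paper; it only illustrates one common way such a comparison can be operationalized: score the same retrieval runs under each judgment set, rank the systems, and measure agreement between the two rankings with Kendall's tau. All run data, judgment sets, the `precision_at_k` helper, and the choice of metric are hypothetical assumptions for illustration.

```python
# A minimal sketch (assumed, not the paper's actual methodology) of comparing
# two sets of relevance judgments by how they rank the same retrieval systems.
from scipy.stats import kendalltau

def precision_at_k(ranked_docs, relevant_docs, k=5):
    """Fraction of the top-k retrieved documents judged relevant."""
    top_k = ranked_docs[:k]
    return sum(1 for d in top_k if d in relevant_docs) / k

# Hypothetical runs: system name -> ranked list of document ids for one query.
runs = {
    "systemA": ["d1", "d3", "d5", "d7", "d9"],
    "systemB": ["d2", "d1", "d4", "d5", "d8"],
    "systemC": ["d9", "d8", "d7", "d2", "d1"],
}

# Two hypothetical judgment sets for the same query,
# e.g. commercial-engine judges vs. NIST assessors.
judgments_engine = {"d1", "d3", "d4", "d5"}
judgments_nist = {"d1", "d2", "d4", "d8"}

def system_scores(judged_relevant):
    """Score every run under one judgment set."""
    return {name: precision_at_k(docs, judged_relevant)
            for name, docs in runs.items()}

scores_engine = system_scores(judgments_engine)
scores_nist = system_scores(judgments_nist)

# Kendall's tau between the two system orderings; a value near 1 means the
# judgment sets lead to the same relative ranking of systems.
systems = sorted(runs)
tau, p_value = kendalltau([scores_engine[s] for s in systems],
                          [scores_nist[s] for s in systems])
print(f"Kendall's tau between system orderings: {tau:.2f} (p={p_value:.2f})")
```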