Authors
Debasis Ganguly,
Emine Yilmaz
Publication date
2023
Description
Because test collections are massive, a standard practice in IR evaluation is to construct a 'pool' of candidate relevant documents composed of the top-k documents retrieved by a wide range of different retrieval systems, a process called depth-k pooling. The depth (k) is usually set to a constant value for every query in the benchmark set. In this paper, however, we argue that the annotation effort can be substantially reduced if the depth of the pool is allowed to vary per query, the rationale being that the number of documents relevant to the information need can vary widely across queries. Our hypothesis is that a lower depth for queries with few relevant documents, and a higher depth for those with many, can reduce the annotation effort without a significant change in IR effectiveness evaluation. We make use of …
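To make the pooling idea concrete, here is a minimal sketch of depth-k pooling with an optional per-query depth. It is an illustration only, not the authors' method: the names `build_pool` and `annotation_effort`, the toy runs, and the per-query depths are all hypothetical, and how a depth is actually chosen for each query is not covered here.

```python
from typing import Dict, List, Set

# A run maps each query id to that system's ranked list of document ids.
Run = Dict[str, List[str]]


def build_pool(runs: List[Run], depth: Dict[str, int],
               default_depth: int = 10) -> Dict[str, Set[str]]:
    """Union the top-k documents of every run, with k looked up per query.

    Queries missing from `depth` fall back to `default_depth`, so passing an
    empty dict reproduces ordinary fixed-depth pooling.
    """
    pool: Dict[str, Set[str]] = {}
    for run in runs:
        for qid, ranking in run.items():
            k = depth.get(qid, default_depth)
            pool.setdefault(qid, set()).update(ranking[:k])
    return pool


def annotation_effort(pool: Dict[str, Set[str]]) -> int:
    """Number of (query, document) pairs an assessor would have to judge."""
    return sum(len(docs) for docs in pool.values())


# Toy data (assumed for illustration): two systems, two queries.
runs = [
    {"q1": ["d1", "d2", "d3", "d4"], "q2": ["d5", "d6", "d7", "d8"]},
    {"q1": ["d2", "d9", "d1", "d3"], "q2": ["d6", "d10", "d5", "d11"]},
]

# Fixed depth: k = 3 for every query.
fixed_pool = build_pool(runs, depth={}, default_depth=3)

# Variable depth: a shallow pool for q1, the same depth as before for q2.
# These depths are made up; they are not predicted by any model.
variable_pool = build_pool(runs, depth={"q1": 1, "q2": 3})

print(annotation_effort(fixed_pool))     # 8
print(annotation_effort(variable_pool))  # 6
```

In this toy case, lowering the depth for q1 from 3 to 1 removes two documents from the judging workload. Whether that saving leaves system rankings unchanged depends on where the relevant documents actually sit, which is the question the abstract raises.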