Authors
Rishabh Mehrotra, Emine Yilmaz
Publication date
2015
Description
The performance of learning to rank algorithms strongly depends on the number of labelled queries in the training set, while the cost of annotating a large number of queries with relevance judgements is prohibitively high. As a result, constructing such a training dataset involves selecting a set of candidate queries for labelling. In this work, we investigate query selection strategies for learning to rank aimed at actively selecting unlabelled queries to be labelled so as to minimize the data annotation cost without degrading the ranking performance. In particular, we characterize query selection based on two aspects, informativeness and representativeness, and propose two novel query selection strategies, (i) permutation probability based query selection and (ii) topic model based query selection, which capture the two aspects, respectively. We further …