Authors
Bin Zou,
Vasileios Lampos,
Shangsong Liang,
Emine Yilmaz
Publication date
2017
Description
We propose an extension to language models for information retrieval. Typically, language models estimate the probability of a document generating the query, where the query is treated as a set of independent search terms. We extend this approach by also considering the concepts implied by both the query and the words in the document. The model combines the probability of the document generating the concept embodied by the query with the traditional language model probability of the document generating the query terms. We use a word embedding space to express concepts. The similarity between two vectors in this space is estimated using a weighted cosine distance, and the weighting significantly enhances the discrimination between vectors. We evaluate our model on benchmark datasets (TREC 6-8) and empirically demonstrate that it outperforms state-of-the-art baselines.
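The scoring idea in the description can be illustrated with a small Python sketch. It is not the authors' implementation. The description does not give the weighting scheme, the way concepts are built from embeddings, or the way the two probabilities are combined, so everything below is an assumption made for illustration: per-dimension non-negative weights, concepts taken as the mean of word vectors, Dirichlet smoothing for the query likelihood, negative similarities clipped to zero, and a linear interpolation with a hypothetical parameter `lam`.

```python
import math
from collections import Counter


def weighted_cosine(u, v, w):
    """Cosine similarity where dimension i is scaled by a non-negative weight w[i]."""
    num = sum(wi * ui * vi for wi, ui, vi in zip(w, u, v))
    norm_u = math.sqrt(sum(wi * ui * ui for wi, ui in zip(w, u)))
    norm_v = math.sqrt(sum(wi * vi * vi for wi, vi in zip(w, v)))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return num / (norm_u * norm_v)


def mean_vector(terms, embeddings, dim):
    """Assumed concept representation: the average of the known word vectors."""
    vecs = [embeddings[t] for t in terms if t in embeddings]
    if not vecs:
        return [0.0] * dim
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def query_likelihood(query, doc, collection, mu=10.0):
    """Dirichlet-smoothed P(query | doc) with independent query terms.

    Assumes a non-empty collection given as a flat list of tokens.
    """
    doc_tf, col_tf = Counter(doc), Counter(collection)
    prob = 1.0
    for term in query:
        p_col = col_tf[term] / len(collection)
        prob *= (doc_tf[term] + mu * p_col) / (len(doc) + mu)
    return prob


def combined_score(query, doc, collection, embeddings, weights, lam=0.5, mu=10.0):
    """Assumed combination: linear interpolation of concept and term evidence."""
    dim = len(weights)
    q_vec = mean_vector(query, embeddings, dim)
    d_vec = mean_vector(doc, embeddings, dim)
    # Clip negative similarities so the concept part stays in [0, 1].
    concept = max(0.0, weighted_cosine(q_vec, d_vec, weights))
    return lam * concept + (1.0 - lam) * query_likelihood(query, doc, collection, mu)
```

With toy two-dimensional embeddings, a document containing only "auto" can outscore a document containing only "fruit" for the query "car", even though neither contains the query term. The term-based part is identical for both documents, so the difference comes entirely from the concept part.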