2010/013 - Information-Based Models for Ad Hoc IR
Stéphane Clinchant, Eric Gaussier
ACM SIGIR 2010 (Special Interest Group on Information Retrieval), Geneva, Switzerland, 19-23 July 2010
In this paper, we introduce the family of information-based models for ad hoc information retrieval. These models draw their inspiration from a long-standing hypothesis in IR, namely that the difference in the behavior of a word at the document and collection levels carries information about the significance of the word for the document. This hypothesis has been exploited in the 2-Poisson mixture models, in the notion of eliteness in BM25, and more recently in Divergence from Randomness (DFR) models. We show here that, combined with notions related to burstiness, it can lead to simpler and better models.