The cluster hypothesis in information retrieval

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review


The cluster hypothesis states that "closely associated documents tend to be relevant to the same requests" [45]. This is one of the most fundamental and influential hypotheses in the field of information retrieval and has given rise to a huge body of work. In this tutorial we will present the research topics that have emerged based on the cluster hypothesis. Specific focus will be placed on cluster-based document retrieval, the use of topic models for ad hoc IR, and the use of graph-based methods that utilize inter-document similarities. Furthermore, we will provide an in-depth survey of the suite of retrieval methods that rely, either explicitly or implicitly, on the cluster hypothesis and which are used for a variety of different tasks; e.g., query expansion, query-performance prediction, fusion and federated search, and search results diversification.
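As an illustration only, the hypothesis can be probed on a toy collection by checking whether the nearest neighbor of each relevant document is itself relevant. The sketch below assumes bag-of-words vectors and cosine similarity. The documents, the relevance judgments, and the function names (`cosine`, `nearest_neighbor_agreement`) are hypothetical and are not taken from the tutorial.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def nearest_neighbor_agreement(docs, relevant):
    """Fraction of relevant documents whose most similar other document
    is also relevant. Higher values are consistent with the hypothesis."""
    if not relevant:
        raise ValueError("need at least one relevant document")
    vecs = {doc_id: Counter(text.lower().split()) for doc_id, text in docs.items()}
    hits = 0
    for doc_id in relevant:
        others = [o for o in vecs if o != doc_id]
        nearest = max(others, key=lambda o: cosine(vecs[doc_id], vecs[o]))
        hits += nearest in relevant
    return hits / len(relevant)


# Hypothetical toy collection; the assumed request is about the animal.
docs = {
    "d1": "jaguar speed big cat hunting",
    "d2": "big cat jaguar habitat rainforest",
    "d3": "jaguar car engine speed review",
    "d4": "car engine repair manual",
}
relevant = {"d1", "d2"}  # assumed relevance judgments, for illustration

score = nearest_neighbor_agreement(docs, relevant)
print(score)
```

A score near 1 on real data would suggest that relevant documents sit close together under the chosen similarity measure. A toy collection of four documents demonstrates only the mechanics, not the hypothesis itself.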

Original language: English
Title of host publication: Advances in Information Retrieval - 36th European Conference on IR Research, ECIR 2014, Proceedings
Number of pages: 4
State: Published - 2014
Event: 36th European Conference on Information Retrieval, ECIR 2014 - Amsterdam, Netherlands
Duration: 13 Apr 2014 - 16 Apr 2014

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 8416 LNCS


Conference: 36th European Conference on Information Retrieval, ECIR 2014

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science (all)

