Abstract
Exploiting information induced from (query-specific) clustering of top-retrieved documents has long been proposed as a means for improving precision at the very top ranks of the returned results. We present a novel language model approach to ranking query-specific clusters by the presumed percentage of relevant documents that they contain. While most previous cluster ranking approaches focus on the cluster as a whole, our model also utilizes information induced from documents associated with the cluster. Our model substantially outperforms previous approaches for identifying clusters containing a high relevant-document percentage. Furthermore, using the model to produce document ranking yields precision-at-top-ranks performance that is consistently better than that of the initial ranking upon which clustering is performed. The performance also compares favorably with that of a state-of-the-art pseudo-feedback-based retrieval method.
| Original language | English |
|---|---|
| Pages (from-to) | 367-395 |
| Number of pages | 29 |
| Journal | Journal of Artificial Intelligence Research |
| Volume | 41 |
| State | Published - May 2011 |
ASJC Scopus subject areas
- Artificial Intelligence
Title: The opposite of smoothing: A language model approach to ranking query-specific document clusters