AI · LLM · March 11, 2026

A Systematic Study of Pseudo-Relevance Feedback with LLMs

Authors

Nour Jedidi, Jimmy Lin

Abstract

Pseudo-relevance feedback (PRF) methods built on large language models (LLMs) can be organized along two key design dimensions: the feedback source, which is where the feedback text is derived from, and the feedback model, which is how the given feedback text is used to refine the query representation. However, the independent role that each dimension plays is unclear, as both are often entangled in empirical evaluations. In this paper, we address this gap by systematically studying how the choice of feedback source and feedback model impacts PRF effectiveness through controlled experimentation. Across 13 low-resource BEIR tasks with five LLM PRF methods, our results show: (1) the choice of feedback model can play a critical role in PRF effectiveness; (2) feedback derived solely from LLM-generated text provides the most cost-effective solution; and (3) feedback derived from the corpus is most beneficial when utilizing candidate documents from a strong first-stage retriever. Together, our findings provide a better understanding of which elements in the PRF design space are most important.
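To make the two design dimensions concrete, here is a minimal sketch (not the paper's implementation) of dense-retrieval PRF: the feedback source is whatever list of texts is fed in (top-k retrieved documents, LLM-generated passages, or both), while the feedback model is how that text refines the query representation, shown here as a simple Rocchio-style interpolation of embeddings. The `embed` and feedback-text inputs are hypothetical placeholders.

```python
# Sketch of the two PRF design dimensions for a dense retriever:
#   - feedback source: which texts are used as feedback
#   - feedback model: how those texts refine the query representation
# `embed` is a hypothetical stand-in for a real text encoder.

import numpy as np

def embed(text: str) -> np.ndarray:
    """Hypothetical encoder: map text to a unit-length dense vector."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(8)
    return v / np.linalg.norm(v)

def rocchio_refine(query_vec: np.ndarray,
                   feedback_vecs: list[np.ndarray],
                   alpha: float = 0.7) -> np.ndarray:
    """Feedback model: interpolate the query vector with the centroid
    of the feedback vectors (Rocchio-style, positive feedback only)."""
    centroid = np.mean(feedback_vecs, axis=0)
    refined = alpha * query_vec + (1.0 - alpha) * centroid
    return refined / np.linalg.norm(refined)

def prf_query(query: str, feedback_texts: list[str]) -> np.ndarray:
    """Combine a feedback source (any list of texts) with the feedback model."""
    return rocchio_refine(embed(query), [embed(t) for t in feedback_texts])

# Feedback source A: candidate documents from a first-stage retriever.
corpus_feedback = ["doc text 1", "doc text 2", "doc text 3"]
# Feedback source B: LLM-generated text for the same query (placeholder
# strings here; in practice, prompt an LLM with the query).
llm_feedback = ["generated passage about the query topic"]

refined = prf_query("what causes coral bleaching", corpus_feedback + llm_feedback)
print(refined.shape)
```

Varying the contents of the feedback list while holding `rocchio_refine` fixed isolates the feedback source; swapping `rocchio_refine` for another update rule while holding the list fixed isolates the feedback model, which is the kind of controlled comparison the paper describes.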

Metadata

arXiv ID: 2603.11008
Provider: ARXIV
Primary Category: cs.IR
Published: 2026-03-11
Fetched: 2026-03-12 04:21
