AI · LLM · March 25, 2026

SumRank: Aligning Summarization Models for Long-Document Listwise Reranking

Authors

Jincheng Feng, Wenhan Liu, Zhicheng Dou

Abstract

Large Language Models (LLMs) have demonstrated superior performance on the listwise passage reranking task. However, directly applying them to rank long-form documents introduces both effectiveness and efficiency issues due to the substantially increased context length. To address this challenge, we propose SumRank, a pointwise summarization model aligned with downstream listwise reranking, which compresses long-form documents into concise, rank-aligned summaries before the final listwise reranking stage. To obtain SumRank, we introduce a three-stage training pipeline comprising cold-start Supervised Fine-Tuning (SFT), specialized RL data construction, and rank-driven alignment via Reinforcement Learning. This paradigm aligns SumRank with downstream ranking objectives to preserve relevance signals. We conduct extensive experiments on five benchmark datasets from the TREC Deep Learning tracks (TREC DL 19-23). Results show that our lightweight SumRank model achieves state-of-the-art (SOTA) ranking performance while significantly improving efficiency by reducing both summarization overhead and reranking complexity.
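The two-stage flow the abstract describes (pointwise compression, then listwise reranking over the short summaries) can be sketched as below. This is a minimal illustration, not the paper's implementation: `summarize` and `listwise_rerank` are hypothetical placeholders standing in for the trained SumRank model and the LLM reranker, using simple query-term overlap so the example runs end to end.

```python
def summarize(query: str, document: str, max_words: int = 32) -> str:
    """Placeholder pointwise summarizer: keep sentences that share query terms.

    Stands in for SumRank, which would produce a rank-aligned summary.
    """
    terms = set(query.lower().split())
    kept = [s for s in document.split(". ") if terms & set(s.lower().split())]
    summary = ". ".join(kept) or document  # fall back to the original text
    return " ".join(summary.split()[:max_words])


def listwise_rerank(query: str, summaries: list[str]) -> list[int]:
    """Placeholder listwise reranker: order candidates by query-term overlap.

    Stands in for an LLM reranker that sees all summaries in one context.
    """
    terms = set(query.lower().split())
    scores = [len(terms & set(s.lower().split())) for s in summaries]
    return sorted(range(len(summaries)), key=lambda i: -scores[i])


def rank_documents(query: str, documents: list[str]) -> list[int]:
    # Stage 1: compress each long document independently (pointwise,
    # so this stage parallelizes across documents).
    summaries = [summarize(query, d) for d in documents]
    # Stage 2: rerank the short summaries jointly (listwise), keeping
    # the reranker's context length small.
    return listwise_rerank(query, summaries)
```

The design point the sketch illustrates is the split itself: the expensive per-document work happens pointwise on full texts, while the listwise stage only ever sees compressed summaries, which is where the efficiency gain comes from.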

Metadata

arXiv ID: 2603.24204
Provider: ARXIV
Primary Category: cs.IR
Published: 2026-03-25
Fetched: 2026-03-26 06:02
