Spritz: Path-Aware Load Balancing in Low-Diameter Networks

Authors

Tommaso Bonato, Ales Kubicek, Abdul Kabbani, Ahmad Ghalayini, Maciej Besta, Torsten Hoefler

Abstract

Low-diameter topologies such as Dragonfly and Slim Fly are increasingly adopted in HPC and datacenter networks, yet existing load balancing techniques either rely on proprietary in-network mechanisms or fail to utilize the full path diversity of these topologies. We introduce Spritz, a flexible sender-based load balancing framework that shifts adaptive topology-aware routing to the endpoints using only standard Ethernet features. We propose two algorithms, Spritz-Scout and Spritz-Spray, that respectively explore and adaptively cache efficient paths using ECN, packet trimming, and timeout feedback. Through simulation on Dragonfly and Slim Fly topologies with over 1000 endpoints, Spritz outperforms ECMP, UGAL-L, and prior sender-based approaches by up to 1.8x in flow completion time under AI training and datacenter workloads, while offering robust failover with performance improvements of up to 25.4x under link failures, all without additional hardware support. Spritz enables datacenter-scale, commodity Ethernet networks to efficiently leverage low-diameter topologies, offering unified routing and load balancing for the Ultra Ethernet era.
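
To make the idea of endpoint-driven path selection more concrete, the sketch below shows one way a sender could explore paths and cache those that return clean acknowledgements, dropping them again on ECN, trimming, or timeout feedback. It is an illustrative toy only, not the Spritz-Scout or Spritz-Spray algorithm from the paper: the class name `PathCache`, the signal names, the integer path identifiers, and the eviction policy are all assumptions made for this example.

```python
import random


class PathCache:
    """Hypothetical sender-side path selector (NOT the paper's algorithm)."""

    def __init__(self, num_paths, cache_size=4, seed=0):
        self.num_paths = num_paths    # assumed: paths are addressed by integer IDs
        self.cache_size = cache_size  # assumed: small bounded cache of good paths
        self.cache = []               # path IDs that recently returned clean ACKs
        self.rng = random.Random(seed)

    def pick(self):
        # Reuse a cached path when one exists; otherwise explore a random path.
        if self.cache:
            return self.rng.choice(self.cache)
        return self.rng.randrange(self.num_paths)

    def feedback(self, path, signal):
        # signal is one of "ack", "ecn", "trim", "timeout" (names assumed here).
        if signal == "ack":
            if path not in self.cache:
                self.cache.append(path)
                if len(self.cache) > self.cache_size:
                    self.cache.pop(0)  # evict the oldest cached path
        elif path in self.cache:
            # Any congestion or loss signal removes the path from the cache.
            self.cache.remove(path)
```

In this toy version every negative signal is treated the same way; a real design would likely weigh ECN marks, trimmed packets, and timeouts differently.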

Metadata

arXiv ID: 2602.19567
Provider: ARXIV
Primary Category: cs.NI
Published: 2026-02-23
Fetched: 2026-02-24 04:38
Journal Reference: Proc. 40th IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2026