Paper
The PLUTO Code on GPUs: Offloading Lagrangian Particle Methods
Authors
Alessio Suriano, Stefano Truzzi, Agnese Costa, Marco Rossazza, Nitin Shukla, Andrea Mignone, Vittoria Berta, Claudio Zanni
Abstract
The Lagrangian Particles (LP) module of the PLUTO code offers a powerful simulation tool to predict the non-thermal emission produced by shock-accelerated particles in large-scale relativistic magnetized astrophysical flows. The LPs represent ensembles of relativistic particles with a given energy distribution, which is updated by solving the relativistic cosmic ray transport equation. The approach consistently includes the effects of adiabatic expansion, synchrotron emission, and inverse Compton emission. The large-scale nature of such systems creates an enormous computational demand that can only be met by targeting modern computing hardware such as Graphics Processing Units (GPUs). In this work we present the GPU-compatible C++ redesign of the LP module, which, by means of the OpenACC programming model and the Message Passing Interface (MPI) library, can target both single commercial GPUs and multi-node (pre-)exascale computing facilities. The code has been benchmarked on up to 28,672 parallel CPU cores and 1,024 parallel GPUs, demonstrating $\sim(80-90)\%$ weak-scaling parallel efficiency and good strong-scaling capabilities. Our results demonstrate a speedup of $6\times$ when solving the same benchmark test with 128 full GPU nodes (4 GPUs per node) against the same number of full high-end CPU nodes (112 cores per node). Furthermore, we verified the code by comparing its predictions to the corresponding analytical solutions for two test cases. This work is part of a broader project that aims at developing gPLUTO, a new, revised GPU-ready implementation of the legacy code.
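For context, the equation below is a schematic form of the kind of relativistic cosmic-ray transport equation the abstract refers to, written along a single particle trajectory. The notation and the exact shape of the loss terms are our assumptions for illustration, not taken from the paper:

    % Schematic spectral evolution along a Lagrangian particle trajectory.
    % N(gamma, tau): particles per unit Lorentz factor; tau: proper time.
    \frac{\partial N(\gamma,\tau)}{\partial \tau}
      + \frac{\partial}{\partial \gamma}\left[\dot{\gamma}\, N(\gamma,\tau)\right] = 0,
    \qquad
    \dot{\gamma} = -\frac{\gamma}{3}\,\nabla_{\mu} u^{\mu}
      - \frac{4 \sigma_T c}{3\, m_e c^{2}}\,\gamma^{2}\,(U_B + U_{\rm rad})

Here the first term in \dot{\gamma} models adiabatic expansion and the second models synchrotron plus inverse Compton losses against the magnetic and radiation energy densities U_B and U_{\rm rad}.

On the computing side, the sketch below shows one minimal way an OpenACC offload of a particle update loop can look in C++. All names (Particle, NBINS, UpdateParticles) are hypothetical and do not reflect the actual gPLUTO API, and the spectral update is a placeholder for the real per-bin transport solve. For scale, the quoted node comparison amounts to 128 x 4 = 512 GPUs against 128 x 112 = 14,336 CPU cores.

    // Minimal OpenACC offload sketch of a Lagrangian-particle update step.
    // Hypothetical names throughout; compile e.g. with: nvc++ -acc -O2 lp.cpp

    constexpr int NBINS = 64;            // spectral bins per particle (assumed)

    struct Particle {
        double x[3];                     // position
        double v[3];                     // fluid velocity interpolated at x
        double spec[NBINS];              // energy distribution N(gamma)
    };

    // Push all particles and apply one spectral-update step on the device.
    void UpdateParticles(Particle *p, long np, double dt)
    {
        // Particles are independent, so the outer loop maps cleanly onto
        // GPU gangs/vectors; copy(...) moves the array to the device and back.
        #pragma acc parallel loop gang vector copy(p[0:np])
        for (long i = 0; i < np; ++i) {
            // Advance the position (plain Euler step, for illustration only).
            for (int d = 0; d < 3; ++d)
                p[i].x[d] += p[i].v[d] * dt;

            // Placeholder spectral update: a uniform decay standing in for
            // the actual adiabatic + radiative transport solve in each bin.
            #pragma acc loop seq
            for (int j = 0; j < NBINS; ++j)
                p[i].spec[j] *= 1.0 - 1.0e-3 * dt;
        }
    }

In production one would rather keep the particle array resident on the device (e.g. with #pragma acc enter data) and update it in place, so that host-device transfers do not happen on every step, with each MPI rank pushing only the particles that live in its subdomain.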
Metadata
arXiv: 2602.23434v1 (astro-ph.HE, primary; cross-listed in astro-ph.IM, cs.DC)
Published: 2026-02-26
Journal reference: Astronomy and Computing, Volume 55, 2026, 101088, ISSN 2213-1337
DOI: 10.1016/j.ascom.2026.101088 (https://doi.org/10.1016/j.ascom.2026.101088)
Comment: Published in Astronomy and Computing. Special issue: Advancing Cosmology and Astrophysics through High-Performance Computing and Machine Learning
Abstract page: https://arxiv.org/abs/2602.23434v1
PDF: https://arxiv.org/pdf/2602.23434v1