Challenges and Design Considerations for Finding CUDA Bugs Through GPU-Native Fuzzing

Authors

Mingkai Li, Joseph Devietti, Suman Jana, Tanvir Ahmed Khan

Abstract

Modern computing is shifting from homogeneous CPU-centric systems to heterogeneous systems with closely integrated CPUs and GPUs. While the CPU software stack has benefited from decades of memory safety hardening, the GPU software stack remains dangerously immature. This discrepancy presents a critical ethical challenge: the world's most advanced AI and scientific workloads are increasingly deployed on vulnerable hardware components. In this paper, we study the key challenges of ensuring memory safety on heterogeneous systems. We show that, while the number of exploitable bugs in heterogeneous systems rises every year, current mitigation methods often rely on unfaithful translations, i.e., converting GPU programs to run on CPUs for testing, which fails to capture the architectural differences between CPUs and GPUs. We argue that the faithfulness of the program behavior is at the core of secure and reliable heterogeneous systems design. To ensure faithfulness, we discuss several design considerations of a GPU-native fuzzing pipeline for CUDA programs.

Metadata

arXiv ID: 2603.05725
Provider: ARXIV
Primary Category: cs.CR
Published: 2026-03-05
Fetched: 2026-03-09 06:05

Comment: Accepted to appear in HotEthics 2026; 6 pages, 1 figure
Abstract URL: https://arxiv.org/abs/2603.05725v1
PDF: https://arxiv.org/pdf/2603.05725v1