Research

Paper

TESTING March 19, 2026

TENSURE: Fuzzing Sparse Tensor Compilers (Registered Report)

Authors

Kabilan Mahathevan, Yining Zhang, Muhammad Ali Gulzar, Kirshanthan Sundararajah

Abstract

Sparse Tensor Compilers (STCs) have emerged as critical infrastructure for optimizing high-dimensional data analytics and machine learning workloads. The STCs must synthesize complex, irregular control flow for various compressed storage formats directly from high-level declarative specifications, thereby making them highly susceptible to subtle correctness defects. Existing testing frameworks, which rely on mutating computation graphs restricted to a standard vocabulary of operators, fail to exercise the arbitrary loop synthesis capabilities of these compilers. Furthermore, generic grammar-based fuzzers struggle to generate valid inputs due to the strict rules governing how indices are reused across multiple tensors. In this paper, we present TENSURE, the first extensible black-box fuzzing framework specifically designed for the testing of STCs. TENSURE leverages Einstein Summation (Einsum) notation as a general input abstraction, enabling the generation of complex, unconventional tensor contractions that expose corner cases in the code-generation phases of STCs. We propose a novel constraint-based generation algorithm that guarantees 100% semantic validity of synthesized kernels, significantly outperforming the ~3.3% validity rate of baseline grammar fuzzers. To enable metamorphic testing without a trusted reference, we introduce a set of semantic-preserving mutation operators that exploit algebraic commutativity and heterogeneity in storage formats. Our evaluation on two state-of-the-art systems, TACO and Finch, reveals widespread fragility, particularly in TACO, where TENSURE exposed crashes or silent miscompilations in a majority of generated test cases. These findings underscore the critical need for specialized testing tools in the sparse compilation ecosystem.

Metadata

arXiv ID: 2603.18372
Provider: ARXIV
Primary Category: cs.PL
Published: 2026-03-19
Fetched: 2026-03-20 06:02

Related papers

Raw Data (Debug)
{
  "raw_xml": "<entry>\n    <id>http://arxiv.org/abs/2603.18372v1</id>\n    <title>TENSURE: Fuzzing Sparse Tensor Compilers (Registered Report)</title>\n    <updated>2026-03-19T00:13:14Z</updated>\n    <link href='https://arxiv.org/abs/2603.18372v1' rel='alternate' type='text/html'/>\n    <link href='https://arxiv.org/pdf/2603.18372v1' rel='related' title='pdf' type='application/pdf'/>\n    <summary>Sparse Tensor Compilers (STCs) have emerged as critical infrastructure for optimizing high-dimensional data analytics and machine learning workloads. The STCs must synthesize complex, irregular control flow for various compressed storage formats directly from high-level declarative specifications, thereby making them highly susceptible to subtle correctness defects. Existing testing frameworks, which rely on mutating computation graphs restricted to a standard vocabulary of operators, fail to exercise the arbitrary loop synthesis capabilities of these compilers. Furthermore, generic grammar-based fuzzers struggle to generate valid inputs due to the strict rules governing how indices are reused across multiple tensors.\n  In this paper, we present TENSURE, the first extensible black-box fuzzing framework specifically designed for the testing of STCs. TENSURE leverages Einstein Summation (Einsum) notation as a general input abstraction, enabling the generation of complex, unconventional tensor contractions that expose corner cases in the code-generation phases of STCs. We propose a novel constraint-based generation algorithm that guarantees 100% semantic validity of synthesized kernels, significantly outperforming the ~3.3% validity rate of baseline grammar fuzzers. To enable metamorphic testing without a trusted reference, we introduce a set of semantic-preserving mutation operators that exploit algebraic commutativity and heterogeneity in storage formats. Our evaluation on two state-of-the-art systems, TACO and Finch, reveals widespread fragility, particularly in TACO, where TENSURE exposed crashes or silent miscompilations in a majority of generated test cases. These findings underscore the critical need for specialized testing tools in the sparse compilation ecosystem.</summary>\n    <category scheme='http://arxiv.org/schemas/atom' term='cs.PL'/>\n    <category scheme='http://arxiv.org/schemas/atom' term='cs.SE'/>\n    <published>2026-03-19T00:13:14Z</published>\n    <arxiv:primary_category term='cs.PL'/>\n    <author>\n      <name>Kabilan Mahathevan</name>\n    </author>\n    <author>\n      <name>Yining Zhang</name>\n    </author>\n    <author>\n      <name>Muhammad Ali Gulzar</name>\n    </author>\n    <author>\n      <name>Kirshanthan Sundararajah</name>\n    </author>\n    <arxiv:doi>10.14722/fuzzing.2026.23006</arxiv:doi>\n    <link href='https://doi.org/10.14722/fuzzing.2026.23006' rel='related' title='doi'/>\n  </entry>"
}