March 11, 2026

Contract And Conquer: How to Provably Compute Adversarial Examples for a Black-Box Model?

Authors

Anna Chistyakova, Mikhail Pautov

Abstract

Black-box adversarial attacks are widely used as tools to test the robustness of deep neural networks against malicious perturbations of input data aimed at a specific change in the output of the model. Although such methods remain empirically effective, they usually do not guarantee that an adversarial example can be found for a particular model. In this paper, we propose Contract And Conquer (CAC), an approach to provably compute adversarial examples for neural networks in a black-box manner. The method is based on knowledge distillation of a black-box model on an expanding distillation dataset and precise contraction of the adversarial example search space. CAC is supported by a transferability guarantee: we prove that the method yields an adversarial example for the black-box model within a fixed number of algorithm iterations. Experimentally, we demonstrate that the proposed approach outperforms existing state-of-the-art black-box attack methods on the ImageNet dataset for different target models, including vision transformers.
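The distill-attack-expand loop the abstract describes can be sketched in miniature. The code below is an illustrative toy, not the paper's CAC algorithm: the "black-box" model is a small linear classifier queried only for labels, the surrogate is fit by logistic regression on the queried points (a stand-in for knowledge distillation), and each iteration crafts a candidate on the surrogate with a signed-gradient step, checks whether it transfers, and otherwise adds it to the expanding distillation set. All names (`black_box`, `train_surrogate`, `fgsm`) and the step-size schedule are assumptions for this sketch; the paper's search-space contraction and its guarantee are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Black-box" target: a fixed 2-class linear model we may only query for labels.
W_true = rng.normal(size=(2, 5))

def black_box(x):
    return int(np.argmax(W_true @ x))

def train_surrogate(X, y, lr=0.1, steps=500):
    """Fit a 2-class linear surrogate by logistic regression on queried labels
    (a toy stand-in for distilling the black-box model)."""
    W = np.zeros((2, X.shape[1]))
    onehot = np.eye(2)[y]
    for _ in range(steps):
        logits = X @ W.T
        p = np.exp(logits - logits.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)
        W -= lr * (p - onehot).T @ X / len(X)
    return W

def fgsm(W, x, label, eps):
    """One signed-gradient ascent step on the surrogate's margin
    (logit of the other class minus logit of the current class)."""
    other = 1 - label
    grad = W[other] - W[label]
    return x + eps * np.sign(grad)

x0 = rng.normal(size=5)
y0 = black_box(x0)

# Expanding distillation set: distill a surrogate, attack it, test transfer,
# and on failure add the queried candidate to the distillation data.
X, y = [x0], [y0]
adv = None
for step in range(20):
    W_s = train_surrogate(np.array(X), np.array(y))
    cand = fgsm(W_s, x0, y0, eps=0.5 * (step + 1))  # grow the step size
    if black_box(cand) != y0:       # candidate transferred to the black box
        adv = cand
        break
    X.append(cand)
    y.append(black_box(cand))       # failed candidate enlarges the dataset
```

By construction, every point spent on a failed transfer is recycled as a distillation sample, so the surrogate's decision boundary near `x0` improves as the attack proceeds; CAC's actual contribution, per the abstract, is pairing this kind of loop with a provable contraction of the search space so success occurs within a fixed number of iterations.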

Metadata

arXiv ID: 2603.10689
Provider: ARXIV
Primary Category: cs.LG
Published: 2026-03-11
Fetched: 2026-03-12 04:21
