February 26, 2026

Mitigating Membership Inference in Intermediate Representations via Layer-wise MIA-risk-aware DP-SGD

Authors

Jiayang Meng, Tao Huang, Chen Hou, Guolong Zheng, Hong Chen

Abstract

In Embedding-as-an-Interface (EaaI) settings, pre-trained models are queried for Intermediate Representations (IRs). The distributional properties of IRs can leak training-set membership signals, enabling Membership Inference Attacks (MIAs) whose strength varies across layers. Although Differentially Private Stochastic Gradient Descent (DP-SGD) mitigates such leakage, existing implementations employ per-example gradient clipping and a uniform, layer-agnostic noise multiplier, ignoring heterogeneous layer-wise MIA vulnerability. This paper introduces Layer-wise MIA-risk-aware DP-SGD (LM-DP-SGD), which adaptively allocates privacy protection across layers in proportion to their MIA risk. Specifically, LM-DP-SGD trains a shadow model on a public shadow dataset, extracts per-layer IRs from its train/test splits, and fits layer-specific MIA adversaries, using their attack error rates as MIA-risk estimates. Leveraging the cross-dataset transferability of MIAs, these estimates are then used to reweight each layer's contribution to the globally clipped gradient during private training, providing layer-appropriate protection under a fixed noise magnitude. We further establish theoretical guarantees on both privacy and convergence of LM-DP-SGD. Extensive experiments show that, under the same privacy budget, LM-DP-SGD reduces the peak IR-level MIA risk while preserving utility, yielding a superior privacy-utility trade-off.
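The core private-training step described in the abstract — reweight each layer's gradient by its estimated MIA risk, clip the concatenated vector globally, then add Gaussian noise of fixed magnitude — can be sketched as below. This is a minimal illustration, not the paper's implementation: the function name `lm_dp_sgd_step` and the inverse-risk weighting (down-weighting high-risk layers so the fixed noise protects them more strongly) are assumptions for exposition; the paper's exact allocation rule may differ.

```python
import numpy as np

def lm_dp_sgd_step(per_layer_grads, layer_risks, clip_norm=1.0,
                   noise_mult=1.0, rng=None):
    """One illustrative LM-DP-SGD aggregation step.

    per_layer_grads: list over examples; each element is a list of
        per-layer gradient arrays for that example.
    layer_risks: one positive score per layer; higher = more vulnerable
        to membership inference (e.g. 1 - attack error of a shadow-model
        adversary on that layer's IRs).
    """
    rng = np.random.default_rng(rng)
    risks = np.asarray(layer_risks, dtype=float)

    # Assumed weighting: scale layers inversely to risk (normalized to
    # mean 1), so high-risk layers contribute less signal relative to
    # the fixed noise and thus receive stronger effective protection.
    inv = 1.0 / risks
    weights = inv / inv.mean()

    summed = None
    for grads in per_layer_grads:
        # Weight each layer's gradient, then clip the concatenated
        # per-example vector to clip_norm (global, not per-layer, clipping).
        flat = np.concatenate([w * g.ravel()
                               for w, g in zip(weights, grads)])
        norm = np.linalg.norm(flat)
        flat = flat * min(1.0, clip_norm / (norm + 1e-12))
        summed = flat if summed is None else summed + flat

    # Fixed noise magnitude, independent of the layer weights.
    noise = rng.normal(0.0, noise_mult * clip_norm, size=summed.shape)
    return (summed + noise) / len(per_layer_grads)
```

Because clipping bounds each example's weighted contribution by `clip_norm` and the noise scale is fixed, the familiar DP-SGD accounting applies; the layer weights only redistribute how much of that bounded signal each layer spends.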

Metadata

arXiv ID: 2602.22611
Provider: ARXIV
Primary Category: cs.LG
Published: 2026-02-26
Fetched: 2026-02-27 04:35
