Does Fine-tuning by Reinforcement Learning Improve Generalization in Binary Speech Deepfake Detection?

Authors

Xin Wang, Ge Wanying, Junichi Yamagishi

Abstract

Building speech deepfake detection models that are generalizable to unseen attacks remains a challenging problem. Although the field has shifted toward a pre-training and fine-tuning paradigm using speech foundation models, most approaches rely solely on supervised fine-tuning (SFT). Inspired by the field of large language models, wherein reinforcement learning (RL) is used for model fine-tuning, we investigate the impact of RL, specifically Group Relative Policy Optimization (GRPO). The results from experiments using multiple detectors and test sets indicate that pure GRPO-based fine-tuning improves performance on out-of-domain test sets while maintaining performance on target-domain test data. This approach outperforms both SFT-only and hybrid setups. Our ablation studies further suggest that the negative reward in GRPO may be a key factor in this improvement.
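The abstract does not spell out the training objective, but a minimal sketch may make the GRPO idea concrete for binary detection. Everything below is an illustrative assumption rather than the paper's recipe: the group size, the +1/-1 reward for correct/incorrect decisions, and the plain REINFORCE-style update (full GRPO additionally uses PPO-style clipping and a KL penalty against a reference model) are all hypothetical choices for this sketch.

import torch
import torch.nn.functional as F

def grpo_loss(logits, labels, group_size=8, eps=1e-6):
    # logits: (B, 2) bonafide/spoof scores from a detector head; labels: (B,).
    log_probs = F.log_softmax(logits, dim=-1)  # (B, 2)
    # Sample a group of G decisions per utterance from the current policy.
    actions = torch.distributions.Categorical(logits=logits).sample((group_size,))  # (G, B)
    # Reward +1 for a correct decision, -1 otherwise; the -1 term plays the
    # role of the "negative reward" the ablation studies point to.
    rewards = (actions == labels).float() * 2.0 - 1.0  # (G, B)
    # Group-relative advantage: normalize rewards within each group.
    adv = (rewards - rewards.mean(dim=0)) / (rewards.std(dim=0) + eps)
    # Log-probability of each sampled decision under the current policy.
    act_logp = log_probs.unsqueeze(0).expand(group_size, -1, -1)
    act_logp = act_logp.gather(2, actions.unsqueeze(-1)).squeeze(-1)  # (G, B)
    # Advantage-weighted policy-gradient objective.
    return -(adv.detach() * act_logp).mean()

# Toy usage, with random tensors standing in for a foundation-model head.
logits = torch.randn(4, 2, requires_grad=True)
labels = torch.randint(0, 2, (4,))
grpo_loss(logits, labels).backward()

Note that this sketch is strictly on-policy: the same logits produce both the sampled decisions and the log-probabilities being reweighted by the group-relative advantage.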

Metadata

arXiv ID: 2603.02914
Provider: ARXIV
Primary Category: eess.AS
Published: 2026-03-03
Fetched: 2026-03-04 03:41
Comment: Submitted to Interspeech 2026
PDF: https://arxiv.org/pdf/2603.02914v1
