TESTING March 16, 2026

Rethinking LLM Watermark Detection in Black-Box Settings: A Non-Intrusive Third-Party Framework

Authors

Zhuoshang Wang, Yubing Ren, Yanan Cao, Fang Fang, Xiaoxue Li, Li Guo

Abstract

While watermarking serves as a critical mechanism for LLM provenance, existing secret-key schemes tightly couple detection with injection, requiring access to keys or provider-side scheme-specific detectors for verification. This dependency creates a fundamental barrier for real-world governance, as independent auditing becomes impossible without compromising model security or relying on the opaque claims of service providers. To resolve this dilemma, we introduce TTP-Detect, a pioneering black-box framework designed for non-intrusive, third-party watermark verification. By decoupling detection from injection, TTP-Detect reframes verification as a relative hypothesis testing problem. It employs a proxy model to amplify watermark-relevant signals and a suite of complementary relative measurements to assess the alignment of the query text with watermarked distributions. Extensive experiments across representative watermarking schemes, datasets and models demonstrate that TTP-Detect achieves superior detection performance and robustness against diverse attacks.

Metadata

arXiv ID: 2603.14968
Provider: ARXIV
Primary Category: cs.CR
Published: 2026-03-16
Fetched: 2026-03-17 06:02
