Research

Paper

TESTING March 10, 2026

A Semi-spontaneous Dutch Speech Dataset for Speech Enhancement and Speech Recognition

Authors

Dimme de Groot, Yuanyuan Zhang, Jorge Martinez, Odette Scharenborg

Abstract

We present DRES: a 1.5-hour Dutch realistic elicited (semi-spontaneous) speech dataset from 80 speakers recorded in noisy, public indoor environments. DRES was designed as a test set for the evaluation of state-of-the-art (SOTA) automatic speech recognition (ASR) and speech enhancement (SE) models in a real-world scenario: a person speaking in a public indoor space with background talkers and noise. The speech was recorded with a four-channel linear microphone array. In this work we evaluate the speech quality of five well-known single-channel SE algorithms and the recognition performance of eight SOTA off-the-shelf ASR models before and after applying SE on the speech of DRES. We found that five out of the eight ASR models have WERs lower than 22% on DRES, despite the challenging conditions. In contrast to recent work, we did not find a positive effect of modern single-channel SE on ASR performance, emphasizing the importance of evaluating in realistic conditions.

Metadata

arXiv ID: 2603.09725

Provider: ARXIV

Primary Category: eess.AS

Published: 2026-03-10

Fetched: 2026-03-11 06:02

Related papers

Cosmic Shear in Effective Field Theory at Two-Loop Order: Revisiting $S_8$ in Dark Energy Survey Data

Shi-Fan Chen, Joseph DeRose, Mikhail M. Ivanov, Oliver H. E. Philcox • 2026-03-30

Stop Probing, Start Coding: Why Linear Probes and Sparse Autoencoders Fail at Compositional Generalisation

Vitória Barin Pacela, Shruti Joshi, Isabela Camacho, Simon Lacoste-Julien, Da... • 2026-03-30

SNID-SAGE: A Modern Framework for Interactive Supernova Classification and Spectral Analysis

Fiorenzo Stoppa, Stephen J. Smartt • 2026-03-30

Acoustic-to-articulatory Inversion of the Complete Vocal Tract from RT-MRI with Various Audio Embeddings and Dataset Sizes

Sofiane Azzouz, Pierre-André Vuissoz, Yves Laprie • 2026-03-30

Rotating black hole shadows in metric-affine bumblebee gravity

Jose R. Nascimento, Ana R. M. Oliveira, Albert Yu. Petrov, Paulo J. Porfírio,... • 2026-03-30

Raw Data (Debug)

{
  "raw_xml": "<entry>\n    <id>http://arxiv.org/abs/2603.09725v1</id>\n    <title>A Semi-spontaneous Dutch Speech Dataset for Speech Enhancement and Speech Recognition</title>\n    <updated>2026-03-10T14:32:12Z</updated>\n    <link href='https://arxiv.org/abs/2603.09725v1' rel='alternate' type='text/html'/>\n    <link href='https://arxiv.org/pdf/2603.09725v1' rel='related' title='pdf' type='application/pdf'/>\n    <summary>We present DRES: a 1.5-hour Dutch realistic elicited (semi-spontaneous) speech dataset from 80 speakers recorded in noisy, public indoor environments. DRES was designed as a test set for the evaluation of state-of-the-art (SOTA) automatic speech recognition (ASR) and speech enhancement (SE) models in a real-world scenario: a person speaking in a public indoor space with background talkers and noise. The speech was recorded with a four-channel linear microphone array. In this work we evaluate the speech quality of five well-known single-channel SE algorithms and the recognition performance of eight SOTA off-the-shelf ASR models before and after applying SE on the speech of DRES. We found that five out of the eight ASR models have WERs lower than 22% on DRES, despite the challenging conditions. In contrast to recent work, we did not find a positive effect of modern single-channel SE on ASR performance, emphasizing the importance of evaluating in realistic conditions.</summary>\n    <category scheme='http://arxiv.org/schemas/atom' term='eess.AS'/>\n    <published>2026-03-10T14:32:12Z</published>\n    <arxiv:comment>Submitted to Interspeech 2026</arxiv:comment>\n    <arxiv:primary_category term='eess.AS'/>\n    <author>\n      <name>Dimme de Groot</name>\n    </author>\n    <author>\n      <name>Yuanyuan Zhang</name>\n    </author>\n    <author>\n      <name>Jorge Martinez</name>\n    </author>\n    <author>\n      <name>Odette Scharenborg</name>\n    </author>\n  </entry>"
}