Paper
LabelBuddy: An Open Source Music and Audio Language Annotation Tagging Tool Using AI Assistance
Authors
Ioannis Prokopiou, Ioannis Sina, Agisilaos Kounelis, Pantelis Vikatos, Themos Stafylakis
Abstract
The advancement of Machine learning (ML), Large Audio Language Models (LALMs), and autonomous AI agents in Music Information Retrieval (MIR) necessitates a shift from static tagging to rich, human-aligned representation learning. However, the scarcity of open-source infrastructure capable of capturing the subjective nuances of audio annotation remains a critical bottleneck. This paper introduces \textbf{LabelBuddy}, an open-source collaborative auto-tagging audio annotation tool designed to bridge the gap between human intent and machine understanding. Unlike static tools, it decouples the interface from inference via containerized backends, allowing users to plug in custom models for AI-assisted pre-annotation. We describe the system architecture, which supports multi-user consensus, containerized model isolation, and a roadmap for extending agents and LALMs. Code available at https://github.com/GiannisProkopiou/gsoc2022-Label-buddy.
Metadata
Related papers
Gen-Searcher: Reinforcing Agentic Search for Image Generation
Kaituo Feng, Manyuan Zhang, Shuang Chen, Yunlong Lin, Kaixuan Fan, Yilei Jian... • 2026-03-30
On-the-fly Repulsion in the Contextual Space for Rich Diversity in Diffusion Transformers
Omer Dahary, Benaya Koren, Daniel Garibi, Daniel Cohen-Or • 2026-03-30
Graphilosophy: Graph-Based Digital Humanities Computing with The Four Books
Minh-Thu Do, Quynh-Chau Le-Tran, Duc-Duy Nguyen-Mai, Thien-Trang Nguyen, Khan... • 2026-03-30
ParaSpeechCLAP: A Dual-Encoder Speech-Text Model for Rich Stylistic Language-Audio Pretraining
Anuj Diwan, Eunsol Choi, David Harwath • 2026-03-30
RAD-AI: Rethinking Architecture Documentation for AI-Augmented Ecosystems
Oliver Aleksander Larsen, Mahyar T. Moghaddam • 2026-03-30
Raw Data (Debug)
{
"raw_xml": "<entry>\n <id>http://arxiv.org/abs/2603.04293v1</id>\n <title>LabelBuddy: An Open Source Music and Audio Language Annotation Tagging Tool Using AI Assistance</title>\n <updated>2026-03-04T17:08:50Z</updated>\n <link href='https://arxiv.org/abs/2603.04293v1' rel='alternate' type='text/html'/>\n <link href='https://arxiv.org/pdf/2603.04293v1' rel='related' title='pdf' type='application/pdf'/>\n <summary>The advancement of Machine learning (ML), Large Audio Language Models (LALMs), and autonomous AI agents in Music Information Retrieval (MIR) necessitates a shift from static tagging to rich, human-aligned representation learning. However, the scarcity of open-source infrastructure capable of capturing the subjective nuances of audio annotation remains a critical bottleneck. This paper introduces \\textbf{LabelBuddy}, an open-source collaborative auto-tagging audio annotation tool designed to bridge the gap between human intent and machine understanding. Unlike static tools, it decouples the interface from inference via containerized backends, allowing users to plug in custom models for AI-assisted pre-annotation. We describe the system architecture, which supports multi-user consensus, containerized model isolation, and a roadmap for extending agents and LALMs. Code available at https://github.com/GiannisProkopiou/gsoc2022-Label-buddy.</summary>\n <category scheme='http://arxiv.org/schemas/atom' term='cs.SD'/>\n <category scheme='http://arxiv.org/schemas/atom' term='cs.AI'/>\n <category scheme='http://arxiv.org/schemas/atom' term='cs.IR'/>\n <category scheme='http://arxiv.org/schemas/atom' term='cs.LG'/>\n <published>2026-03-04T17:08:50Z</published>\n <arxiv:comment>Accepted at NLP4MusA 2026 (4th Workshop on NLP for Music and Audio)</arxiv:comment>\n <arxiv:primary_category term='cs.SD'/>\n <author>\n <name>Ioannis Prokopiou</name>\n </author>\n <author>\n <name>Ioannis Sina</name>\n </author>\n <author>\n <name>Agisilaos Kounelis</name>\n </author>\n <author>\n <name>Pantelis Vikatos</name>\n </author>\n <author>\n <name>Themos Stafylakis</name>\n </author>\n </entry>"
}