Papers
Research papers from arXiv and related sources
When Safety Becomes a Vulnerability: Exploiting LLM Alignment Homogeneity for Transferable Blocking in RAG
Retrieval-Augmented Generation (RAG) enhances the capabilities of large language models (LLMs) by incorporating external knowledge, but its reliance on potentially poisonable knowledge bases introd...
Junchen Li, Chao Qi, Rongzheng Wang, Qizhi Chen, Liang Xu, Di Liang, Bob Simons, Shuang Liang
Automated Testbed for Repeatable Evaluation of Ultra-Wideband Localization Performance
Testing Ultra-Wideband (UWB) systems is challenging, as multiple devices need to coordinate over lossy links and the systems' behavior is influenced by timing, synchronization, and environmental fa...
Alexander Kemptner, Julian Karoliny, Hannah Brunner, Andreas Gaich, Michael Neubauer, Fjolla Adem...
Rethinking Role-Playing Evaluation: Anonymous Benchmarking and a Systematic Study of Personality Effects
Large language models (LLMs) have demonstrated significant potential in developing Role-Playing Agents (RPAs). However, current research primarily evaluates RPAs using famous fictional characters, ...
Ji-Lun Peng, Yun-Nung Chen
Fast proton transport and neutron production in proton therapy using Fourier neural operators
Objective: Real-time adaptive proton range verification systems based on produced neutrons require accurate information on their non-isotropic momentum distributions within short times, for which M...
Francesco Blangiardi, Hunter N. Ratliff, Fabian Teichert, Kristian Smeland Ytre-Hauge, Jan Langer...
From Threat Intelligence to Firewall Rules: Semantic Relations in Hybrid AI Agent and Expert System Architectures
Web security demands rapid response capabilities to evolving cyber threats. Agentic Artificial Intelligence (AI) promises automation, but the need for trustworthy security responses is of the utmos...
Chiara Bonfanti, Davide Colaiacomo, Luca Cagliero, Cataldo Basile
Asymptotic sharpness of a Nikolskii type inequality for rational functions in the Wiener algebra
We establish the asymptotic sharpness of a Nikolskii type inequality proved by A. Baranov and R. Zarouf for rational functions $f$ in the Wiener algebra of absolutely convergent Fourier series, wit...
Benjamin Auxemery, Alexander Borichev, Rachid Zarouf
Measuring Privacy vs. Fidelity in Synthetic Social Media Datasets
Synthetic data is increasingly used to support research without exposing sensitive user content. Social media data is one of the types of datasets that would hugely benefit from representative synt...
Henry Tari, Adriana Iamnitchi
From Misclassifications to Outliers: Joint Reliability Assessment in Classification
Building reliable classifiers is a fundamental challenge for deploying machine learning in real-world applications. A reliable system should not only detect out-of-distribution (OOD) inputs but als...
Yang Li, Youyang Sha, Yinzhi Wang, Timothy Hospedales, Xi Shen, Shell Xu Hu, Xuanlong Yu
IROSA: Interactive Robot Skill Adaptation using Natural Language
Foundation models have demonstrated impressive capabilities across diverse domains, while imitation learning provides principled methods for robot skill adaptation from limited data. Combining thes...
Markus Knauer, Samuel Bustamante, Thomas Eiband, Alin Albu-Schäffer, Freek Stulp, João Silvério
CzechTopic: A Benchmark for Zero-Shot Topic Localization in Historical Czech Documents
Topic localization aims to identify spans of text that express a given topic defined by a name and description. To study this task, we introduce a human-annotated benchmark based on Czech historica...
Martin Kostelník, Michal Hradiš, Martin Dočekal
On the Suitability of LLM-Driven Agents for Dark Pattern Audits
As LLM-driven agents begin to autonomously navigate the web, their ability to interpret and respond to manipulative interface design becomes critical. A fundamental question that emerges is: can su...
Chen Sun, Yash Vekaria, Rishab Nithyanand
CarbonPATH: Carbon-aware pathfinding and architecture optimization for chiplet-based AI systems
The exponential growth of AI has created unprecedented demand for computational resources, pushing chip designs to the limit while simultaneously escalating the environmental footprint of computing...
Chetan Choppali Sudarshan, Jiajun Hu, Aman Arora, Vidya A. Chhabria
Believe Your Model: Distribution-Guided Confidence Calibration
Large Reasoning Models have demonstrated remarkable performance with the advancement of test-time scaling techniques, which enhance prediction accuracy by generating multiple candidate responses a...
Xizhong Yang, Haotian Zhang, Huiming Wang, Mofei Song
Ising Models of Cooperativity in Muscle Contraction
Regulation of contraction in striated muscle is controlled by a dual mechanism involving both thin filaments containing actin and thick filaments containing myosin. The thin filament is activated b...
Elaheh Saadat, Matthieu Caruel, Stefano Gherardini, Ilaria Morotti, Matteo Marcello, Marco Carema...
Assessing the Effectiveness of LLMs in Delivering Cognitive Behavioral Therapy
As mental health issues continue to rise globally, there is an increasing demand for accessible and scalable therapeutic solutions. Many individuals currently seek support from Large Language Model...
Navdeep Singh Bedi, Ana-Maria Bucur, Noriko Kando, Fabio Crestani
A Robust Compressible APIC/FLIP Particle Grid Method with Conservative Resampling and Adaptive APIC/PIC Blending
Modeling inviscid compressible flows with shocks and vortex-dominated dynamics remains challenging for particle-grid methods due to moving discontinuities, cell-crossing noise, and quadrature degra...
Jiansheng Yao, Yingkui Zhao
A Sensitivity Analysis of Multi-Event Audio Grounding in Audio LLMs
Audio LLMs have shown a strong ability to understand audio samples, yet their reliability in complex acoustic scenes remains under-explored. Unlike prior work limited to small scale or less control...
Taehan Lee, Jaehan Jung, Hyukjun Lee
Benchmarking Motivational Interviewing Competence of Large Language Models
Motivational interviewing (MI) promotes behavioural change in substance use disorders. Its fidelity is measured using the Motivational Interviewing Treatment Integrity (MITI) framework. While large...
Aishwariya Jha, Prakrithi Shivaprakash, Lekhansh Shukla, Animesh Mukherjee, Prabhat Chand, Pratim...
Semantic Bridging Domains: Pseudo-Source as Test-Time Connector
Distribution shifts between training and testing data are a critical bottleneck limiting the practical utility of models, especially in real-world test-time scenarios. To adapt models when the sour...
Xizhong Yang, Huiming Wang, Ning Xu, Mofei Song
All-in-One Image Restoration via Causal-Deconfounding Wavelet-Disentangled Prompt Network
Image restoration represents a promising approach for addressing the inherent defects of image content distortion. Standard image restoration approaches suffer from high storage cost and the requir...
Bingnan Wang, Bin Qin, Jiangmeng Li, Fanjiang Xu, Fuchun Sun, Hui Xiong