Papers
Research papers from arXiv and related sources
MedCausalX: Adaptive Causal Reasoning with Self-Reflection for Trustworthy Medical Vision-Language Models
Vision-Language Models (VLMs) have enabled interpretable medical diagnosis by integrating visual perception with linguistic reasoning. Yet, existing medical chain-of-thought (CoT) models lack expli...
Jianxin Lin, Chunzheng Zhu, Peter J. Kneuertz, Yunfei Bai, Yuan Xue
Amplitude Analysis of the Isospin-Violating Decay $J/ψ\rightarrowγηπ^{0}$
Using $(10087 \pm 44)\times 10^{6}$ $\jpsi$ events collected with the BESIII detector, we perform the first amplitude analysis of the process $\jpsi\toγη\piz$. The decay is dominated by the interme...
BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, C. S. Akondi, R. Alibert...
MsFormer: Enabling Robust Predictive Maintenance Services for Industrial Devices
Providing reliable predictive maintenance is a critical industrial AI service essential for ensuring the high availability of manufacturing devices. Existing deep-learning methods present competiti...
Jiahui Zhou, Dan Li, Ruibing Jin, Jian Lou, Yanran Zhao, Zhenghua Chen, Zigui Jiang, See-Kiong Ng
Good for the Planet, Bad for Me? Intended and Unintended Consequences of AI Energy Consumption Disclosure
To address the high energy consumption of artificial intelligence, energy consumption disclosure (ECD) has been proposed to steer users toward more sustainable practices, such as choosing efficient...
Michael Klesel, Uwe Messer
Can an LLM Detect Instances of Microservice Infrastructure Patterns?
Architectural patterns are frequently found in various software artifacts. The wide variety of patterns and their implementations makes detection challenging with current tools, especially since th...
Carlos Eduardo Duarte, Neil B. Harrison, Filipe Figueiredo Correia, Ademar Aguiar, Pavlína Gonçalves
MLLM-HWSI: A Multimodal Large Language Model for Hierarchical Whole Slide Image Understanding
Whole Slide Images (WSIs) exhibit hierarchical structure, where diagnostic information emerges from cellular morphology, regional tissue organization, and global context. Existing Computational Pat...
Basit Alawode, Arif Mahmood, Muaz Khalifa Al-Radi, Shahad Albastaki, Asim Khan, Muhammad Bilal, M...
Mind Your HEARTBEAT! Claw Background Execution Inherently Enables Silent Memory Pollution
We identify a critical security vulnerability in mainstream Claw personal AI agents: untrusted content encountered during heartbeat-driven background execution can silently pollute agent memory and...
Yechao Zhang, Shiqian Zhao, Jie Zhang, Gelei Deng, Jiawen Zhang, Xiaogeng Liu, Chaowei Xiao, Tian...
Minibal: Balanced Game-Playing Without Opponent Modeling
Recent advances in game AI, such as AlphaZero and Athénan, have achieved superhuman performance across a wide range of board games. While highly powerful, these agents are ill-suited for human-AI i...
Quentin Cohen-Solal, Tristan Cazenave
Prompt Amplification and Zero-Shot Late Fusion in Audio-Language Models for Speech Emotion Recognition
Audio-Language Models (ALMs) are making strides in understanding speech and non-speech audio. However, domain-specialist Foundation Models (FMs) remain the best for closed-ended speech processing t...
Saurabh Kataria, Xiao Hu
Post-Selection Distributional Model Evaluation
Formal model evaluation methods typically certify that a model satisfies a prescribed target key performance indicator (KPI) level. However, in many applications, the relevant target KPI level may ...
Amirmohammad Farzaneh, Osvaldo Simeone
DBAutoDoc: Automated Discovery and Documentation of Undocumented Database Schemas via Statistical Analysis and Iterative LLM Refinement
A tremendous number of critical database systems lack adequate documentation. Declared primary keys are absent, foreign key constraints have been dropped for performance, column names are cryptic a...
Amith Nagarajan, Thomas Altman
PCR: A Prefetch-Enhanced Cache Reuse System for Low-Latency RAG Serving
Retrieval-Augmented Generation (RAG) systems enhance the performance of large language models (LLMs) by incorporating supplementary retrieved documents, enabling more accurate and context-aware res...
Wenfeng Wang, Xiaofeng Hou, Peng Tang, Hengyi Zhou, Jing Wang, Xinkai Wang, Chao Li, Minyi Guo
Parametric Knowledge and Retrieval Behavior in RAG Fine-Tuning for Electronic Design Automation
Retrieval-Augmented Generation (RAG) fine-tuning has shown substantial improvements over vanilla RAG, yet most studies target document question answering and often rely on standard NLP metrics that...
Julian Oestreich, Maximilian Bley, Frank Binder, Lydia Müller, Maksym Sydorenko, André Alcalde
HUydra: Full-Range Lung CT Synthesis via Multiple HU Interval Generative Modelling
Currently, a central challenge and bottleneck in the deployment and validation of computer-aided diagnosis (CAD) models within the field of medical imaging is data scarcity. For lung cancer, one of...
António Cardoso, Pedro Sousa, Tania Pereira, Hélder P. Oliveira
Light fermionic dark matter window in the scotogenic inverse seesaw model
The origin of neutrino mass and the nature of dark matter (DM) remain unresolved puzzles in particle physics, and an appealing possibility is to address both in a unified picture. This paper explor...
Huan-Can Liang, Yi Liao, Xiao-Dong Ma, Mu-Yuan Song, Hao-Lin Wang
YOLOv10 with Kolmogorov-Arnold networks and vision-language foundation models for interpretable object detection and trustworthy multimodal AI in computer vision perception
The interpretable object detection capabilities of a novel Kolmogorov-Arnold network framework are examined here. The approach refers to a key limitation in computer vision for autonomous vehicles ...
Marios Impraimakis, Daniel Vazquez, Feiyu Zhou
Traffic Sign Recognition in Autonomous Driving: Dataset, Benchmark, and Field Experiment
Traffic Sign Recognition (TSR) is a core perception capability for autonomous driving, where robustness to cross-region variation, long-tailed categories, and semantic ambiguity is essential for re...
Guoyang Zhao, Weiqing Qi, Kai Zhang, Chenguang Zhang, Zeying Gong, Zhihai Bi, Kai Chen, Benshan M...
Modelling Emotions is an Elusive Pursuit in Affective Computing
Affective computing - combining sensor technology, machine learning, and psychology - have been studied for over three decades and is employed in AI-powered technologies to enhance emotional awaren...
Anders Rolighed Larsen, Sneha Das, Line Clemmensen
Knowledge Access Beats Model Size: Memory Augmented Routing for Persistent AI Agents
Production AI agents frequently receive user-specific queries that are highly repetitive, with up to 47\% being semantically similar to prior interactions, yet each query is typically processed wit...
Xunzhuo Liu, Bowei He, Xue Liu, Andy Luo, Haichen Zhang, Huamin Chen
On the Suboptimality of Rate--Distortion-Optimal Compression: Fundamental Accuracy Limits for Distributed Localization
We derive fundamental accuracy limits for distributed localization when a fusion center has access only to independently rate-distortion (RD)-optimally compressed versions of multi-sensor observati...
Amir Weiss