Research

Papers

Research papers from arXiv and related sources

Total: 4694 | AI/LLM: 2583 | Testing: 2111
TESTING

Test-Time Attention Purification for Backdoored Large Vision Language Models

Despite their strong multimodal performance, large vision-language models (LVLMs) are vulnerable during fine-tuning to backdoor attacks, where adversaries insert trigger-embedded samples into the tra...

Zhifang Zhang, Bojun Yang, Shuo He, Weitong Chen, Wei Emma Zhang, Olaf Maennel, Lei Feng, Miao Xu

2603.12989 2026-03-13
TESTING

Fair Lung Disease Diagnosis from Chest CT via Gender-Adversarial Attention Multiple Instance Learning

We present a fairness-aware framework for multi-class lung disease diagnosis from chest CT volumes, developed for the Fair Disease Diagnosis Challenge at the PHAROS-AIF-MIH Workshop (CVPR 2026). Th...

Aditya Parikh, Aasa Feragen

2603.12988 2026-03-13
AI LLM

Is Human Annotation Necessary? Iterative MBR Distillation for Error Span Detection in Machine Translation

Error Span Detection (ESD) is a crucial subtask in Machine Translation (MT) evaluation, aiming to identify the location and severity of translation errors. While fine-tuning models on human-annotat...

Boxuan Lyu, Haiyue Song, Zhi Qu

2603.12983 2026-03-13
TESTING

On the timescales of controlled termination of tokamak plasmas

The RAPTOR code is used to model how the time required for controlled termination of Ohmic plasmas scales from present tokamaks like TCV and JET, to reactor-grade tokamaks like ITER and DEMO. We sh...

Simon Van Mulders, Olivier Sauter

2603.12972 2026-03-13
AI LLM

Generative Horcrux: Designing AI Carriers for Afterlife Selves

As generative AI technologies rapidly advance, AI agents are gaining the ability not only to collect data and perform tasks but also to respond to environments and evolve over time. This shift open...

Zhen-Chi Lai, Yu-Ting Cheng, Pei-Ying Lin, Chiao-Wei Ho, Janet Yi-Ching Huang

2603.12971 2026-03-13
TESTING

Tied-array beam flatfielding

Context. Multi-element phased-array radio telescopes use digital beamforming to widen their field-of-view with numerous tied-array beams (TABs). These beams share bandpass variations and radio freq...

Dirk Kuiper, Cees Bassa, Ziggy Pleunis, Jason Hessels

2603.12970 2026-03-13
TESTING

Long-form RewardBench: Evaluating Reward Models for Long-form Generation

The widespread adoption of reinforcement learning-based alignment highlights the growing importance of reward models. Various benchmarks have been built to evaluate reward models in various domains...

Hui Huang, Yancheng He, Wei Liu, Muyun Yang, Jiaheng Liu, Kehai Chen, Bing Xu, Conghui Zhu, Hailo...

2603.12963 2026-03-13
TESTING

Recent electroweak measurements from the CMS experiment

Recent measurements of electroweak phenomena from the Compact Muon Solenoid (CMS) experiment at the Large Hadron Collider are summarized. The standard model of particle physics was tested through h...

Cristina-Andreea Alexe

2603.12961 2026-03-13
AI LLM

Delta1 with LLM: symbolic and neural integration for credible and explainable reasoning

Neuro-symbolic reasoning increasingly demands frameworks that unite the formal rigor of logic with the interpretability of large language models (LLMs). We introduce an end-to-end explainability by...

Yang Xu, Jun Liu, Shuwei Chen, Chris Nugent, Hailing Guo

2603.12953 2026-03-13
AI LLM

MotionAnymesh: Physics-Grounded Articulation for Simulation-Ready Digital Twins

Converting static 3D meshes into interactable articulated assets is crucial for embodied AI and robotic simulation. However, existing zero-shot pipelines struggle with complex assets due to a criti...

WenBo Xu, Liu Liu, Li Zhang, Dan Guo, RuoNan Liu

2603.12936 2026-03-13
AI LLM

Can Fairness Be Prompted? Prompt-Based Debiasing Strategies in High-Stakes Recommendations

Large Language Models (LLMs) can infer sensitive attributes such as gender or age from indirect cues like names and pronouns, potentially biasing recommendations. While several debiasing methods ex...

Mihaela Rotar, Theresia Veronika Rampisela, Maria Maistro

2603.12935 2026-03-13
AI LLM

Photonic Exponential Approximation via Cascaded TFLN Microring Resonators toward Softmax

The rapid growth of large-scale AI models has intensified energy consumption and data-movement challenges in modern datacenters. Photonic accelerators offer a promising path by executing the line...

Hyoseok Park, Yeonsang Park

2603.12934 2026-03-13
AI LLM

Efficient and Interpretable Multi-Agent LLM Routing via Ant Colony Optimization

Large Language Model (LLM)-driven Multi-Agent Systems (MAS) have demonstrated strong capability in complex reasoning and tool use, and heterogeneous agent pools further broaden the quality-cost tr...

Xudong Wang, Chaoning Zhang, Jiaquan Zhang, Chenghao Li, Qigan Sun, Sung-Ho Bae, Peng Wang, Ning ...

2603.12933 2026-03-13
AI LLM

DS$^2$-Instruct: Domain-Specific Data Synthesis for Large Language Models Instruction Tuning

Adapting Large Language Models (LLMs) to specialized domains requires high-quality instruction tuning datasets, which are expensive to create through human annotation. Existing data synthesis metho...

Ruiyao Xu, Noelle I. Samia, Han Liu

2603.12932 2026-03-13
AI LLM

Teaching Agile Requirements Engineering: A Stakeholder Simulation with Generative AI

Context: The active involvement of users and customers in agile software development remains a persistent challenge in practice. For this reason, it is important that students in higher education b...

Eva-Maria Schön, Michael Neumann, Tiago Silva da Silva

2603.12925 2026-03-13
TESTING

DirPA: Addressing Prior Shift in Imbalanced Few-shot Crop-type Classification

Real-world agricultural monitoring is often hampered by severe class imbalance and high label acquisition costs, resulting in significant data scarcity. In few-shot learning (FSL) -- a framework sp...

Joana Reuss, Ekaterina Gikalo, Marco Körner

2603.12905 2026-03-13
AI LLM

Human-Centered Evaluation of an LLM-Based Process Modeling Copilot: A Mixed-Methods Study with Domain Experts

Integrating Large Language Models (LLMs) into business process management tools promises to democratize Business Process Model and Notation (BPMN) modeling for non-experts. While automated framewor...

Chantale Lauer, Peter Pfeiffer, Nijat Mehdiyev

2603.12895 2026-03-13
TESTING

Development of a Methodology for the Automated Spatial Mapping of Heterogeneous Elastoplastic Properties of Welded Joints

Knowledge of the mechanical properties of materials is required for the design and analysis of engineering products; however, the characterisation of heterogeneous properties using traditional tech...

Robert Hamill, Allan Harte, Aleksander Marek, Fabrice Pierron

2603.12892 2026-03-13
TESTING

A protocol for evaluating robustness to H&E staining variation in computational pathology models

Sensitivity to staining variation remains a major barrier to deploying computational pathology (CPath) models as hematoxylin and eosin (H&E) staining varies across laboratories, requiring systemati...

Lydia A. Schönpflug, Nikki van den Berg, Sonali Andani, Nanda Horeweg, Jurriaan Barkey Wolf, Tjal...

2603.12886 2026-03-13
AI LLM

Enhanced Drug-drug Interaction Prediction Using Adaptive Knowledge Integration

Drug-drug interaction event (DDIE) prediction is crucial for preventing adverse reactions and ensuring optimal therapeutic outcomes. However, existing methods often face challenges with imbalanced ...

Pengfei Liu, Jun Tao, Zhixiang Ren

2603.12885 2026-03-13