Research

Papers

Research papers from arXiv and related sources

Total: 4513 · AI/LLM: 2483 · Testing: 2030
AI LLM

LLMTailor: A Layer-wise Tailoring Tool for Efficient Checkpointing of Large Language Models

Checkpointing is essential for fault tolerance in training large language models (LLMs). However, existing methods, regardless of their I/O strategies, periodically store the entire model and optim...

Minqiu Sun, Xin Huang, Luanzheng Guo, Nathan R. Tallent, Kento Sato, Dong Dai

2602.22158 2026-02-25
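
The baseline this entry improves on is easy to picture: save the whole model and optimizer state at every interval. A minimal PyTorch-style sketch, where the layer-wise variant (persisting only named layers) is our illustration of the general idea, not LLMTailor's actual algorithm:

    import torch

    def save_full_checkpoint(model, optimizer, step, path):
        # Baseline: persist the entire model and optimizer state each interval.
        torch.save({"step": step,
                    "model": model.state_dict(),
                    "optimizer": optimizer.state_dict()}, path)

    def save_layerwise_checkpoint(model, selected_layers, step, path):
        # Illustrative layer-wise variant: persist only the named layers,
        # e.g. those judged to have changed enough since the last checkpoint.
        partial = {k: v for k, v in model.state_dict().items()
                   if any(k.startswith(layer) for layer in selected_layers)}
        torch.save({"step": step, "model": partial}, path)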
AI LLM

Dynamic Personality Adaptation in Large Language Models via State Machines

The inability of Large Language Models (LLMs) to modulate their personality expression in response to evolving dialogue dynamics hinders their performance in complex, interactive contexts. We propo...

Leon Pielage, Ole Hätscher, Mitja Back, Bernhard Marschall, Benjamin Risse

2602.22157 2026-02-25
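
As a rough sketch of the state-machine idea (our illustration; the states, triggers, and prompts below are hypothetical, not the authors' design), personality states can map to system prompts, with dialogue signals driving transitions:

    from dataclasses import dataclass

    # Hypothetical personality states and dialogue triggers.
    TRANSITIONS = {
        ("warm", "user_frustrated"): "calm",
        ("calm", "user_engaged"): "warm",
        ("warm", "task_critical"): "formal",
    }

    PROMPTS = {
        "warm": "Respond in a friendly, encouraging tone.",
        "calm": "Respond in a measured, de-escalating tone.",
        "formal": "Respond precisely and professionally.",
    }

    @dataclass
    class PersonalityFSM:
        state: str = "warm"

        def step(self, signal: str) -> str:
            # Take the transition if one is defined for this signal, else stay put.
            self.state = TRANSITIONS.get((self.state, signal), self.state)
            return PROMPTS[self.state]  # system prompt for the next turn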
TESTING

Enhancing Framingham Cardiovascular Risk Score Transparency through Logic-Based XAI

Cardiovascular disease (CVD) remains one of the leading global health challenges, accounting for more than 19 million deaths worldwide each year. To address this, several tools that aim to predict CVD risk a...

Emannuel L. de A. Bezerra, Luiz H. T. Viana, Vinícius P. Chagas, Diogo E. Rolim, Thiago Alves Roc...

2602.22149 2026-02-25
AI LLM

Provable Last-Iterate Convergence for Multi-Objective Safe LLM Alignment via Optimistic Primal-Dual

Reinforcement Learning from Human Feedback (RLHF) plays a significant role in aligning Large Language Models (LLMs) with human preferences. While RLHF with expected reward constraints can be formul...

Yining Li, Peizhong Ju, Ness Shroff

2602.22146 2026-02-25
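
For orientation, constrained alignment problems of this kind are usually written as reward maximization under expected-constraint lower bounds; the template below is a textbook optimistic primal-dual sketch in standard notation, not the paper's algorithm:

    \max_{\theta}\; J_r(\theta) \quad \text{s.t.} \quad J_{g_i}(\theta) \ge b_i,\; i = 1, \dots, m

    L(\theta, \lambda) = J_r(\theta) + \sum_{i=1}^{m} \lambda_i \bigl( J_{g_i}(\theta) - b_i \bigr), \qquad \lambda \ge 0

    \theta_{t+1} = \theta_t + \eta\, \nabla_\theta L(\theta_t, \lambda_t), \qquad
    \lambda_{t+1} = \bigl[ \lambda_t - \eta\,(2 v_t - v_{t-1}) \bigr]_+ , \quad v_t = J_g(\theta_t) - b

The optimistic dual step replaces the current gradient v_t with the extrapolation 2v_t - v_{t-1}, which is what typically enables last-iterate rather than averaged-iterate guarantees in saddle-point methods.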
AI LLM

When AI Writes, Whose Voice Remains? Quantifying Cultural Marker Erasure Across World English Varieties in Large Language Models

Large Language Models (LLMs) are increasingly used to "professionalize" workplace communication, often at the cost of linguistic identity. We introduce "Cultural Ghosting", the systematic erasure...

Satyam Kumar Navneet, Joydeep Chandra, Yong Zhang

2602.22145 2026-02-25
AI LLM

WeaveTime: Stream from Earlier Frames into Emergent Memory in VideoLLMs

Recent advances in Multimodal Large Language Models have greatly improved visual understanding and reasoning, yet their quadratic attention and offline training protocols make them ill-suited for s...

Yulin Zhang, Cheng Shi, Sibei Yang

2602.22142 2026-02-25
AI LLM

Secure Semantic Communications via AI Defenses: Fundamentals, Solutions, and Future Directions

Semantic communication (SemCom) redefines wireless communication from reproducing symbols to transmitting task-relevant semantics. However, this AI-native architecture also introduces new vulnerabi...

Lan Zhang, Chengsi Liang, Zeming Zhuang, Yao Sun, Fang Fang, Xiaoyong Yuan, Dusit Niyato

2602.22134 2026-02-25
AI LLM

IndicIFEval: A Benchmark for Verifiable Instruction-Following Evaluation in 14 Indic Languages

Instruction-following benchmarks remain predominantly English-centric, leaving a critical evaluation gap for the hundreds of millions of Indic language speakers. We introduce IndicIFEval, a benchma...

Thanmay Jayakumar, Mohammed Safi Ur Rahman Khan, Raj Dabre, Ratish Puduppully, Anoop Kunchukuttan

2602.22125 2026-02-25
TESTING

Searches for new physics beyond the Standard Model in hyperon sector

Hyperon physics offers a distinctive laboratory for probing the intensity frontier and searching for physics beyond the Standard Model. This review summarizes recent results from the BESIII experim...

Jianyu Zhang, Jinlin Fu, Hai-Bo Li

2602.22119 2026-02-25
TESTING

Don't stop me now: Rethinking Validation Criteria for Model Parameter Selection

Despite the extensive literature on training loss functions, the evaluation of generalization on the validation set remains underexplored. In this work, we conduct a systematic empirical and statis...

Andrea Apicella, Francesco Isgrò, Andrea Pollastro, Roberto Prevete

2602.22107 2026-02-25
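
For context, the conventional rule such work revisits is "select the parameters at the epoch with minimum validation loss"; a minimal sketch of that baseline criterion (our illustration, with made-up loss values):

    def select_checkpoint(val_losses):
        # Conventional criterion: keep the epoch with the lowest validation loss.
        return min(range(len(val_losses)), key=lambda e: val_losses[e])

    # A noisy validation curve: the minimum need not be the final epoch.
    print(select_checkpoint([0.92, 0.71, 0.66, 0.68, 0.67]))  # -> 2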
TESTING

PASTA: A Modular Program Analysis Tool Framework for Accelerators

The increasing complexity and diversity of hardware accelerators in modern computing systems demand flexible, low-overhead program analysis tools. We present PASTA, a low-overhead and modular Progr...

Mao Lin, Hyeran Jeon, Keren Zhou

2602.22103 2026-02-25
TESTING

Petri Net Relaxation for Infeasibility Explanation and Sequential Task Planning

Plans often change as the situation, or our understanding of it, evolves. Sometimes, a feasible plan may not even exist, and identifying such infeasibilities is useful to determine ...

Nguyen Cong Nhat Le, John G. Rogers, Claire N. Bonial, Neil T. Dantam

2602.22094 2026-02-25
AI LLM

Confidence-Driven Multi-Scale Model Selection for Cost-Efficient Inference

Large Language Models (LLMs) have revolutionized inference across diverse natural language tasks, with larger models performing better but at higher computational costs. We propose a confidence-dri...

Bo-Wei Chen, Chung-Chi Chen, An-Zi Yen

2602.22090 2026-02-25
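
The cascade idea here is straightforward to sketch: query a small model first and escalate only when its confidence falls below a threshold. In this minimal illustration, small_model and large_model are hypothetical callables returning (answer, confidence), and the threshold is our assumption, not the paper's:

    def cascade_infer(prompt, small_model, large_model, threshold=0.8):
        # Try the cheap model first; each callable returns (answer, confidence in [0, 1]).
        answer, confidence = small_model(prompt)
        if confidence >= threshold:
            return answer                # accept the cheap answer
        answer, _ = large_model(prompt)  # escalate only when unsure
        return answer

The cost saving depends on how often the small model clears the threshold; the selection rule itself is a single comparison per query.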
AI LLM

Transmission Delay Minimization for NOMA-Based F-RANs

A novel non-orthogonal multiple access (NOMA) based low-delay service framework is proposed for fog radio access networks (F-RANs). Fog access points (FAPs) leverage NOMA for local delivery of cach...

Yuan Ai, Xidong Mu, Pengbo Si, Yuanwei Liu

2602.22087 2026-02-25
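
For readers new to NOMA: in the standard two-user downlink, signals are superposed in the power domain and the stronger user decodes via successive interference cancellation (SIC). These are the textbook rate expressions, written from general knowledge rather than taken from the paper:

    x = \sqrt{a_1 P}\, s_1 + \sqrt{a_2 P}\, s_2, \qquad a_1 + a_2 = 1, \; a_1 > a_2

    R_1 = \log_2\!\Bigl( 1 + \frac{a_1 P |h_1|^2}{a_2 P |h_1|^2 + \sigma^2} \Bigr) \quad \text{(weak user, treats } s_2 \text{ as noise)}

    R_2 = \log_2\!\Bigl( 1 + \frac{a_2 P |h_2|^2}{\sigma^2} \Bigr) \quad \text{(strong user, after SIC of } s_1 \text{)}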
AI LLM

ViSTAR: Virtual Skill Training with Augmented Reality with 3D Avatars and LLM coaching agent

We present ViSTAR, a Virtual Skill Training system in AR that supports self-guided basketball skill practice, with feedback on balance, posture, and timing. From a formative study with basketball p...

Chunggi Lee, Hayato Saiki, Tica Lin, Eiji Ikeda, Kenji Suzuki, Chen Zhu-Tian, Hanspeter Pfister

2602.22077 2026-02-25
TESTING

RustyDL: A Program Logic for Rust

Rust is a modern programming language that guarantees memory safety and the absence of data races with a strong type system. We present RustyDL, a program logic for Rust, as a foundation for an aut...

Daniel Drodt, Reiner Hähnle

2602.22075 2026-02-25
AI LLM

Understanding Artificial Theory of Mind: Perturbed Tasks and Reasoning in Large Language Models

Theory of Mind (ToM) refers to an agent's ability to model the internal states of others. Contributing to the debate over whether large language models (LLMs) exhibit genuine ToM capabilities, our study...

Christian Nickel, Laura Schrewe, Florian Mai, Lucie Flek

2602.22072 2026-02-25
AI LLM

Language Models Exhibit Inconsistent Biases Towards Algorithmic Agents and Human Experts

Large language models are increasingly used in decision-making tasks that require them to process information from a variety of sources, including both human experts and other algorithmic agents. H...

Jessica Y. Bo, Lillio Mok, Ashton Anderson

2602.22070 2026-02-25
TESTING

Pools as Portfolios: Observed arbitrage efficiency & LVR analysis of dynamic weight AMMs

Dynamic-weight AMMs (aka Temporal Function Market Makers, TFMMs) implement algorithmic asset allocation, analogous to index or smart beta funds, by continuously updating pools' weights. A strategy ...

Matthew Willetts, Christian Harrington

2602.22069 2026-02-25
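
Background for this entry: dynamic-weight AMMs generalize the fixed-weight geometric-mean invariant of Balancer-style pools by making the weights functions of time. The invariant below is the standard weighted-pool form, stated from general knowledge rather than the paper:

    \prod_{i=1}^{n} R_i^{\, w_i(t)} = k, \qquad \sum_{i=1}^{n} w_i(t) = 1

Raising w_i(t) shifts the pool's quoted prices so that arbitrageurs trade it toward holding a larger share of asset i, which is how weight schedules implement algorithmic rebalancing.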
AI LLM

Semantic Partial Grounding via LLMs

Grounding is a critical step in classical planning, yet it often becomes a computational bottleneck due to the exponential growth in grounded actions and atoms as task size increases. Recent advanc...

Giuseppe Canonaco, Alberto Pozanco, Daniel Borrajo

2602.22067 2026-02-25
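
The blow-up that motivates this entry is concrete: grounding instantiates every action schema with every compatible combination of objects, so ground action counts grow exponentially with schema arity. A toy illustration with a hypothetical move(robot, from, to) schema:

    from itertools import product

    robots = [f"r{i}" for i in range(5)]
    locations = [f"l{i}" for i in range(20)]

    # Instantiate move(robot, from, to) over all object combinations.
    ground_actions = [("move", r, a, b)
                      for r, a, b in product(robots, locations, locations)
                      if a != b]
    print(len(ground_actions))  # 5 * 20 * 19 = 1900 from a single schema

Real domains multiply this across many schemas and far larger object sets, which is why partial or lazy grounding pays off.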