Papers

Research papers from arXiv and related sources

Total: 4513 | AI/LLM: 2483 | Testing: 2030
AI LLM

SafeSeek: Universal Attribution of Safety Circuits in Language Models

Mechanistic interpretability reveals that safety-critical behaviors (e.g., alignment, jailbreak, backdoor) in Large Language Models (LLMs) are grounded in specialized functional components. However...

Miao Yu, Siyuan Fu, Moayad Aloqaily, Zhenhong Zhou, Safa Otoum, Xing Fan, Kun Wang, Yufei Guo, Qi...

2603.23268 2026-03-24
AI LLM

On the Vulnerability of FHE Computation to Silent Data Corruption

Fully Homomorphic Encryption (FHE) is rapidly emerging as a promising foundation for privacy-preserving cloud services, enabling computation directly on encrypted data. As FHE implementations matur...

Jianan Mu, Ge Yu, Zhaoxuan Kan, Song Bian, Liang Kong, Zizhen Liu, Cheng Liu, Jing Ye, Huawei Li

2603.23253 2026-03-24
AI LLM

AI Lifecycle-Aware Feasibility Framework for Split-RIC Orchestration in NTN O-RAN

Integrating Artificial Intelligence (AI) into Non-Terrestrial Networks (NTN) is constrained by the joint limits of satellite SWaP and feeder-link capacity, which directly impact O-RAN closed-loop c...

Daniele Tarchi

2603.23252 2026-03-24
AI LLM

Is AI Catching Up to Human Expression? Exploring Emotion, Personality, Authorship, and Linguistic Style in English and Arabic with Six Large Language Models

The advancing fluency of LLMs raises important questions about their ability to emulate complex human traits, including emotional expression and personality, across diverse linguistic and cultural ...

Nasser A Alsadhan

2603.23251 2026-03-24
AI LLM

Virtual materials testing of ASSB cathodes combining AI-based stochastic 3D modeling and numerical simulations

The performance of all-solid-state battery (ASSB) cathodes strongly depends on their microstructure. Optimizing the cathode morphology can therefore enhance effective macroscopic properties such as...

Anina Dufter, Sabrina Weber, Orkun Furat, Johannes Schubert, René Rekers, Maximilian Luczak, Erik...

2603.23248 2026-03-24
AI LLM

MemCollab: Cross-Agent Memory Collaboration via Contrastive Trajectory Distillation

Large language model (LLM)-based agents rely on memory mechanisms to reuse knowledge from past problem-solving experiences. Existing approaches typically construct memory in a per-agent manner, tig...

Yurui Chang, Yiran Wu, Qingyun Wu, Lu Lin

2603.23234 2026-03-24
AI LLM

I Came, I Saw, I Explained: Benchmarking Multimodal LLMs on Figurative Meaning in Memes

Internet memes represent a popular form of multimodal online communication and often use figurative elements to convey layered meaning through the combination of text and images. However, it remain...

Shijia Zhou, Saif M. Mohammad, Barbara Plank, Diego Frassinelli

2603.23229 2026-03-24
AI LLM

Decoding AI Authorship: Can LLMs Truly Mimic Human Style Across Literature and Politics?

Amidst the rising capabilities of generative AI to mimic specific human styles, this study investigates the ability of state-of-the-art large language models (LLMs), including GPT-4o, Gemini 1.5 Pr...

Nasser A Alsadhan

2603.23219 2026-03-24
AI LLM

Sparser, Faster, Lighter Transformer Language Models

Scaling autoregressive large language models (LLMs) has driven unprecedented progress but comes with vast computational costs. In this work, we tackle these costs by leveraging unstructured sparsit...

Edoardo Cetin, Stefano Peluchetti, Emilio Castillo, Akira Naruse, Mana Murakami, Llion Jones

2603.23198 2026-03-24
AI LLM

Who Is in the Room? Stakeholder Perspectives on AI Recording in Pediatric Emergency Care

Artificial intelligence systems that record voice and video during pediatric emergencies are emerging as human-computer interaction (HCI) technologies with direct implications for clinical work, ...

Alexandre De Masi, Sergio Manzano, Johan N. Siebert, Frederic Ehrler

2603.23187 2026-03-24
AI LLM

ViKey: Enhancing Temporal Understanding in Videos via Visual Prompting

Recent advancements in Video Large Language Models (VideoLLMs) have enabled strong performance across diverse multimodal video tasks. To reduce the high computational cost of processing dense video...

Yeonkyung Lee, Dayun Ju, Youngmin Kim, Seil Kang, Seong Jae Hwang

2603.23186 2026-03-24
AI LLM

ImplicitRM: Unbiased Reward Modeling from Implicit Preference Data for LLM alignment

Reward modeling represents a long-standing challenge in reinforcement learning from human feedback (RLHF) for aligning language models. Current reward modeling is heavily contingent upon experiment...

Hao Wang, Haocheng Yang, Licheng Pan, Lei Shen, Xiaoxi Li, Yinuo Wang, Zhichao Chen, Yuan Lu, Hao...

2603.23184 2026-03-24
AI LLM

Reasoning over Semantic IDs Enhances Generative Recommendation

Recent advances in generative recommendation have leveraged pretrained LLMs by formulating sequential recommendation as autoregressive generation over a unified token space comprising language toke...

Yingzhi He, Yan Sun, Junfei Tan, Yuxin Chen, Xiaoyu Kong, Chunxu Shen, Xiang Wang, An Zhang, Tat-...

2603.23183 2026-03-24
AI LLM

From Synthetic to Native: Benchmarking Multilingual Intent Classification in Logistics Customer Service

Multilingual intent classification is central to customer-service systems on global logistics platforms, where models must process noisy user queries across languages and hierarchical label spaces....

Haoyu He, Jinyu Zhuang, Haoran Chu, Shuhang Yu, J, T AI Group, Hao Wang, Kunpeng Han

2603.23172 2026-03-24
AI LLM

Robust Safety Monitoring of Language Models via Activation Watermarking

Large language models (LLMs) can be misused to reveal sensitive information, such as weapon-making instructions or writing malware. LLM providers rely on monitoring to detect and flag unsa...

Toluwani Aremu, Daniil Ognev, Samuele Poppi, Nils Lukas

2603.23171 2026-03-24
AI LLM

UniDial-EvalKit: A Unified Toolkit for Evaluating Multi-Faceted Conversational Abilities

Benchmarking AI systems in multi-turn interactive scenarios is essential for understanding their practical capabilities in real-world applications. However, existing evaluation protocols are highly...

Qi Jia, Haodong Zhao, Dun Pei, Xiujie Song, Shibo Wang, Zijian Chen, Zicheng Zhang, Xiangyang Zhu...

2603.23160 2026-03-24
AI LLM

Why AI-Generated Text Detection Fails: Evidence from Explainable AI Beyond Benchmark Accuracy

The widespread adoption of Large Language Models (LLMs) has made the detection of AI-Generated text a pressing and complex challenge. Although many detection systems report high benchmark accuracy,...

Shushanta Pudasaini, Luis Miralles-Pechuán, David Lillis, Marisa Llorens Salvador

2603.23146 2026-03-24
AI LLM

Can Language Models Pass Software Testing Certification Exams? a case study

Large Language Models (LLMs) play a pivotal role in both academic research and broader societal applications. LLMs are increasingly used in software testing activities such as test case generation,...

Fitash Ul Haq, Jordi Cabot

2603.23142 2026-03-24
AI LLM

DAK-UCB: Diversity-Aware Prompt Routing for LLMs and Generative Models

The expansion of generative AI and LLM services underscores the growing need for adaptive mechanisms to select an appropriate available model to respond to a user's prompts. Recent works have propo...

Donya Jafari, Farzan Farnia

2603.23140 2026-03-24
AI LLM

HGNet: Scalable Foundation Model for Automated Knowledge Graph Generation from Scientific Literature

Automated knowledge graph (KG) construction is essential for navigating the rapidly expanding body of scientific literature. However, existing approaches struggle to recognize long multi-word entit...

Devvrat Joshi, Islem Rekik

2603.23136 2026-03-24