Research

Papers

Research papers from arXiv and related sources

Total: 4694, AI/LLM: 2583, Testing: 2111
AI LLM

Good-Enough LLM Obfuscation (GELO)

Large Language Models (LLMs) are increasingly served on shared accelerators where an adversary with read access to device memory can observe KV caches and hidden states, threatening prompt privacy ...

Anatoly Belikov, Ilya Fedotov

2603.05035 2026-03-05
AI LLM

The "Gold Rush" in AI and Robotics Patenting Activity. Do innovation systems have a role?

This paper studies patenting trends in artificial intelligence (AI) and robotics from 1980 to 2019. We introduce a novel distinction between traditional robotics and robotics embedding AI functiona...

Giovanni Guidetti, Riccardo Leoncini, Mariele Macaluso

2603.05034 2026-03-05
AI LLM

AegisUI: Behavioral Anomaly Detection for Structured User Interface Protocols in AI Agent Systems

AI agents that build user interfaces on the fly, assembling buttons, forms, and data displays from structured protocol payloads, are becoming common in production systems. The trouble is that a paylo...

Mohd Safwan Uddin, Saba Hajira

2603.05031 2026-03-05
AI LLM

Survive at All Costs: Exploring LLM's Risky Behaviors under Survival Pressure

As Large Language Models (LLMs) evolve from chatbots to agentic assistants, they are increasingly observed to exhibit risky behaviors when subjected to survival pressure, such as the threat of bein...

Yida Lu, Jianwei Fang, Xuyang Shao, Zixuan Chen, Shiyao Cui, Shanshan Bian, Guangyao Su, Pei Ke, ...

2603.05028 2026-03-05
AI LLM

S5-SHB Agent: Society 5.0 enabled Multi-model Agentic Blockchain Framework for Smart Home

The smart home is a key application domain within the Society 5.0 vision for a human-centered society. As smart home ecosystems expand with heterogeneous IoT protocols, diverse devices, and evolvin...

Janani Rangila, Akila Siriweera, Incheon Paik, Keitaro Naruse, Isuru Jayanada, Vishmika Devindi

2603.05027 2026-03-05
AI LLM

RepoLaunch: Automating Build&Test Pipeline of Code Repositories on ANY Language and ANY Platform

Building software repositories typically requires significant manual effort. Recent advances in large language model (LLM) agents have accelerated automation in software engineering (SWE). We intro...

Kenan Li, Rongzhi Li, Linghao Zhang, Qirui Jin, Liao Zhu, Xiaosong Huang, Geng Zhang, Yikai Zhang...

2603.05026 2026-03-05
AI LLM

Measuring the Fragility of Trust: Devising Credibility Index via Explanation Stability (CIES) for Business Decision Support Systems

Explainable Artificial Intelligence (XAI) methods (SHAP, LIME) are increasingly adopted to interpret models in high-stakes businesses. However, the credibility of these explanations, their stabilit...

Alin-Gabriel Vaduva, Simona-Vasilica Oprea, Adela Bara

2603.05024 2026-03-05
TESTING

Haptics in Cognition: Disruptor or Enabler of Memory?

This exploratory pilot study investigates the impact of haptic perception, specifically tactile sensitivity (touch) and kinaesthetic intensity (movement), on learning, operationalized as informat...

Bibeg Limbu, Irene-Angelica Chounta

2603.05019 2026-03-05
AI LLM

BioLLMAgent: A Hybrid Framework with Enhanced Structural Interpretability for Simulating Human Decision-Making in Computational Psychiatry

Computational psychiatry faces a fundamental trade-off: traditional reinforcement learning (RL) models offer interpretability but lack behavioral realism, while large language model (LLM) agents ge...

Zuo Fei, Kezhi Wang, Xiaomin Chen, Yizhou Huang

2603.05016 2026-03-05
AI LLM

Tell2Adapt: A Unified Framework for Source Free Unsupervised Domain Adaptation via Vision Foundation Model

Source Free Unsupervised Domain Adaptation (SFUDA) is critical for deploying deep learning models across diverse clinical settings. However, existing methods are typically designed for low-gap, spe...

Yulong Shi, Shijie Li, Ziyi Li, Lin Qi

2603.05012 2026-03-05
TESTING

Observational and Thermodynamic aspects of one-dimensional Dark Energy EoS parametrization models

We examine the observational viability and physical implications of the Gong-Zhang (GZ) dark-energy equation-of-state parametrizations using exclusively late-time cosmological probes. Two one-dime...

Anirban Chatterjee, Yungui Gong

2603.05009 2026-03-05
TESTING

Poisoning the Inner Prediction Logic of Graph Neural Networks for Clean-Label Backdoor Attacks

Graph Neural Networks (GNNs) have achieved remarkable results in various tasks. Recent studies reveal that graph backdoor attacks can poison the GNN model to predict test nodes with triggers attach...

Yuxiang Zhang, Bin Ma, Enyan Dai

2603.05004 2026-03-05
AI LLM

ThaiSafetyBench: Assessing Language Model Safety in Thai Cultural Contexts

The safety evaluation of large language models (LLMs) remains largely centered on English, leaving non-English languages and culturally grounded risks underexplored. In this work, we investigate LL...

Trapoom Ukarapol, Nut Chukamphaeng, Kunat Pipatanakul, Pakhapoom Sarapat

2603.04992 2026-03-05
AI LLM

Training for Technology: Adoption and Productive Use of Generative AI in Legal Analysis

Can targeted user training unlock the productive potential of generative artificial intelligence (GenAI) in professional settings? We investigate this question using a randomized study involving 16...

Benjamin M. Chen, Hong Bao

2603.04982 2026-03-05
TESTING

Think, Then Verify: A Hypothesis-Verification Multi-Agent Framework for Long Video Understanding

Long video understanding is challenging due to dense visual redundancy, long-range temporal dependencies, and the tendency of chain-of-thought and retrieval-based agents to accumulate semantic drif...

Zheng Wang, Haoran Chen, Haoxuan Qin, Zhipeng Wei, Tianwen Qian, Cong Bai

2603.04977 2026-03-05
AI LLM

3D-RFT: Reinforcement Fine-Tuning for Video-based 3D Scene Understanding

Reinforcement Learning with Verifiable Rewards (RLVR) has emerged as a transformative paradigm for enhancing the reasoning capabilities of Large Language Models (LLMs), yet its potential in 3D s...

Xiongkun Linghu, Jiangyong Huang, Baoxiong Jia, Siyuan Huang

2603.04976 2026-03-05
AI LLM

VRM: Teaching Reward Models to Understand Authentic Human Preferences

Large Language Models (LLMs) have achieved remarkable success across diverse natural language tasks, yet the reward models employed for aligning LLMs often encounter challenges of reward hacking, w...

Biao Liu, Ning Xu, Junming Yang, Hao Xu, Xin Geng

2603.04974 2026-03-05
AI LLM

Functionality-Oriented LLM Merging on the Fisher-Rao Manifold

Weight-space merging aims to combine multiple fine-tuned LLMs into a single model without retraining, yet most existing approaches remain fundamentally parameter-space heuristics. This creates thre...

Jiayu Wang, Zuojun Ye, Wenpeng Yin

2603.04972 2026-03-05
AI LLM

MPCEval: A Benchmark for Multi-Party Conversation Generation

Multi-party conversation generation, such as smart reply and collaborative assistants, is an increasingly important capability of generative AI, yet its evaluation remains a critical bottleneck. Co...

Minxing Zhang, Yi Yang, Zhuofan Jia, Xuan Yang, Jian Pei, Yuchen Zang, Xingwang Deng, Xianglong Chen

2603.04969 2026-03-05
AI LLM

When Weak LLMs Speak with Confidence, Preference Alignment Gets Stronger

Preference alignment is an essential step in adapting large language models (LLMs) to human values, but existing approaches typically depend on costly human annotations or large-scale API-based mod...

Amirabbas Afzali, Myeongho Jeon, Maria Brbic

2603.04968 2026-03-05