Research

Papers

Research papers from arXiv and related sources

Total: 4513 | AI/LLM: 2483 | Testing: 2030
AI LLM

"I Should Know, But I Dare Not Ask": From Understanding Challenges in Patient Journeys to Deriving Design Implications for North Korean Defectors' Adaptation

While it is known that North Korean defectors (NKDs) struggle with South Korea's healthcare system, the specific challenges of their patient journey remain underexplored. To investigate this, we co...

Hyungwoo Song, Jeongha Kim, Minju Kim, Duhyung Kwak, Minjeong Shin, Bongwon Suh, Hyunggu Jung

2603.12632 2026-03-13
AI LLM

Collaborative Multi-Agent Optimization for Personalized Memory System

Memory systems are crucial to personalized LLMs by mitigating the context window limitation in capturing long-term user-LLM conversations. Typically, such systems leverage multiple agents to handle...

Wenyu Mao, Haoyang Liu, Zhao Liu, Haosong Tan, Yaorui Shi, Jiancan Wu, An Zhang, Xiang Wang

2603.12631 2026-03-13
AI LLM

The Economics of AI Supply Chain Regulation

The rise of foundation models has driven the emergence of AI supply chains, where upstream foundation model providers offer fine-tuning and inference services to downstream firms developing domain-...

Sihan Qian, Amit Mehra, Dengpan Liu

2603.12630 2026-03-13
AI LLM

Towards unified brain-to-text decoding across speech production and perception

Speech production and perception are the main ways humans communicate daily. Prior brain-to-text decoding studies have largely focused on a single modality and alphabetic languages. Here, we presen...

Zhizhang Yuan, Yang Yang, Gaorui Zhang, Baowen Cheng, Zehan Wu, Yuhao Xu, Xiaoying Liu, Liang Che...

2603.12628 2026-03-13
AI LLM

AEGIS: No Tool Call Left Unchecked -- A Pre-Execution Firewall and Audit Layer for AI Agents

AI agents increasingly act through external tools: they query databases, execute shell commands, read and write files, and send network requests. Yet in most current agent stacks, model-generated t...

Aojie Yuan, Zhiyuan Su, Yue Zhao

2603.12621 2026-03-13
AI LLM

Human-AI Collaborative Autonomous Experimentation With Proxy Modeling for Comparative Observation

Optimization for different tasks like material characterization, synthesis, and functional properties for desired applications over multi-dimensional control parameters needs a rapid strategic searc...

Arpan Biswas, Hiroshi Funakubo, Yongtao Liu

2603.12618 2026-03-13
AI LLM

Literary Narrative as Moral Probe: A Cross-System Framework for Evaluating AI Ethical Reasoning and Refusal Behavior

Existing AI moral evaluation frameworks test for the production of correct-sounding ethical responses rather than the presence of genuine moral reasoning capacity. This paper introduces a novel pro...

David C. Flynn

2603.12615 2026-03-13
AI LLM

ChainFuzzer: Greybox Fuzzing for Workflow-Level Multi-Tool Vulnerabilities in LLM Agents

Tool-augmented LLM agents increasingly rely on multi-step, multi-tool workflows to complete real tasks. This design expands the attack surface, because data produced by one tool can be persisted an...

Jiangrong Wu, Zitong Yao, Yuhong Nan, Zibin Zheng

2603.12614 2026-03-13
AI LLM

InterDeepResearch: Enabling Human-Agent Collaborative Information Seeking through Interactive Deep Research

Deep research systems powered by LLM agents have transformed complex information seeking by automating the iterative retrieval, filtering, and synthesis of insights from massive-scale web sources. ...

Bo Pan, Lunke Pan, Yitao Zhou, Qi Jiang, Zhen Wen, Minfeng Zhu, Wei Chen

2603.12608 2026-03-13
AI LLM

A2Z-10M+: Geometric Deep Learning with A-to-Z BRep Annotations for AI-Assisted CAD Modeling and Reverse Engineering

Reverse engineering and rapid prototyping of computer-aided design (CAD) models from 3D scans, sketches, or simple text prompts are vital in industrial product design. However, recent advances in g...

Pritham Kumar Jena, Bhavika Baburaj, Tushar Anand, Vedant Dutta, Vineeth Ulavala, Sk Aziz Ali

2603.12605 2026-03-13
AI LLM

How GenAI Mentor Configurations Shape Early Collaborative Dynamics: A Classroom Comparison of Individual and Shared Agents

Generative artificial intelligence (GenAI) is increasingly embedded in computer-supported collaborative learning (CSCL), yet little empirical research has unpacked how different configurations of A...

Siyu Zha, Weijing Liu, Fei Qin, Jie Cao, Yanjin Wang, Yujia Liu, Kaiyi Zhang, Jiangtao Gong, Ying...

2603.12600 2026-03-13
AI LLM

Feynman: Knowledge-Infused Diagramming Agent for Scalable Visual Designs

Visual design is an essential application of state-of-the-art multi-modal AI systems. Improving these systems requires high-quality vision-language data at scale. Despite the abundance of internet ...

Zixin Wen, Yifu Cai, Kyle Lee, Sam Estep, Josh Sunshine, Aarti Singh, Yuejie Chi, Wode Ni

2603.12597 2026-03-13
AI LLM

Swap-guided Preference Learning for Personalized Reinforcement Learning from Human Feedback

Reinforcement Learning from Human Feedback (RLHF) is a widely used approach to align large-scale AI systems with human values. However, RLHF typically assumes a single, universal reward, which over...

Gihoon Kim, Euntai Kim

2603.12595 2026-03-13
AI LLM

Expert Pyramid Tuning: Efficient Parameter Fine-Tuning for Expertise-Driven Task Allocation

Parameter-Efficient Fine-Tuning (PEFT) has become a dominant paradigm for deploying LLMs in multi-task scenarios due to its extreme parameter efficiency. While Mixture-of-Experts (MoE) based LoRA v...

Jia-Chen Zhang, Zhen-Wei Yan, Yu-Jie Xiong, Chun-Ming Xia

2603.12577 2026-03-13
AI LLM

From Woofs to Words: Towards Intelligent Robotic Guide Dogs with Verbal Communication

Assistive robotics is an important subarea of robotics that focuses on the well-being of people with disabilities. A robotic guide dog is an assistive quadruped robot that helps visually impaired p...

Yohei Hayamizu, David DeFazio, Hrudayangam Mehta, Zainab Altaweel, Jacqueline Choe, Chao Lin, Jak...

2603.12574 2026-03-13
AI LLM

LMEB: Long-horizon Memory Embedding Benchmark

Memory embeddings are crucial for memory-augmented systems, such as OpenClaw, but their evaluation is underexplored in current text embedding benchmarks, which narrowly focus on traditional passage...

Xinping Zhao, Xinshuo Hu, Jiaxin Xu, Danyu Tang, Xin Zhang, Mengjia Zhou, Yan Zhong, Yao Zhou, Zi...

2603.12572 2026-03-13
AI LLM

Speech-Worthy Alignment for Japanese SpeechLLMs via Direct Preference Optimization

SpeechLLMs typically combine ASR-trained encoders with text-based LLM backbones, leading them to inherit written-style output patterns unsuitable for text-to-speech synthesis. This mismatch is part...

Mengjie Zhao, Lianbo Liu, Yusuke Fujita, Hao Shi, Yuan Gao, Roman Koshkin, Yui Sudo

2603.12565 2026-03-13
AI LLM

AgentDrift: Unsafe Recommendation Drift Under Tool Corruption Hidden by Ranking Metrics in LLM Agents

Tool-augmented LLM agents increasingly serve as multi-turn advisors in high-stakes domains, yet their evaluation relies on ranking-quality metrics that measure what is recommended but not whether i...

Zekun Wu, Adriano Koshiyama, Sahan Bulathwela, Maria Perez-Ortiz

2603.12564 2026-03-13
AI LLM

Large Language Models as Delivery Rider: Generating Instant Food Delivery Riders' Routing Decision with LLM Agent Framework

The utilization of Large Language Models (LLMs) to power human-like agents has shown remarkable potential in simulating individual mobility patterns. However, a significant gap remains in modeling c...

Chengbo Zhang, Zuopeng Xiao

2603.12559 2026-03-13
AI LLM

Reinforcement Learning for Diffusion LLMs with Entropy-Guided Step Selection and Stepwise Advantages

Reinforcement learning (RL) has been effective for post-training autoregressive (AR) language models, but extending these methods to diffusion language models (DLMs) is challenging due to intractab...

Vishnu Teja Kunde, Fatemeh Doudi, Mahdi Farahbakhsh, Dileep Kalathil, Krishna Narayanan, Jean-Fra...

2603.12554 2026-03-13