Papers

Research papers from arXiv and related sources

Total: 4694 (AI/LLM: 2583, Testing: 2111)
AI LLM

Experiences Build Characters: The Linguistic Origins and Functional Impact of LLM Personality

Human problem-solving is enriched by a diversity of styles and personality traits, yet the development of Large Language Models (LLMs) has largely prioritized uniform performance benchmarks that fa...

Xi Wang, Mengdie Zhuang, Jiqun Liu

2603.06088 2026-03-06
AI LLM

Lyapunov Probes for Hallucination Detection in Large Foundation Models

We address hallucination detection in Large Language Models (LLMs) and Multimodal Large Language Models (MLLMs) by framing the problem through the lens of dynamical systems stability theory. Rather...

Bozhi Luan, Gen Li, Yalan Qin, Jifeng Guo, Yun Zhou, Faguo Wu, Hongwei Zheng, Wenjun Wu, Zhaoxin Fan

2603.06081 2026-03-06
AI LLM

Distributed Semantic Alignment over Interference Channels: A Game-Theoretic Approach

Semantic communication acts as a key enabler for effective task execution in AI-driven systems, prioritizing the extraction of the underlying meaning before transmission. However, when devices rely...

Giuseppe Di Poce, Mattia Merluzzi, Emilio Calvanese Strinati, Paolo Di Lorenzo

2603.06077 2026-03-06
AI LLM

Evaluating Austrian A-Level German Essays with Large Language Models for Automated Essay Scoring

Automated Essay Scoring (AES) has been explored for decades with the goal to support teachers by reducing grading workload and mitigating subjective biases. While early systems relied on handcrafte...

Jonas Kubesch, Lena Huber, Clemens Havas

2603.06066 2026-03-06
AI LLM

ChatShopBuddy: Towards Reliable Conversational Shopping Agents via Reinforcement Learning

Conversational shopping agents represent a critical consumer-facing application of Large Language Model (LLM)-powered agents, yet how to effectively apply post-training Reinforcement Learning (RL) ...

Yiruo Cheng, Kelong Mao, Tianhao Li, Jiejun Tan, Ji-Rong Wen, Zhicheng Dou

2603.06065 2026-03-06
AI LLM

Agentic LLM Planning via Step-Wise PDDL Simulation: An Empirical Characterisation

Task planning, the problem of sequencing actions to reach a goal from an initial state, is a core capability requirement for autonomous robotic systems. Whether large language models (LLMs) can ser...

Kai Göbel, Pierrick Lorang, Patrik Zips, Tobias Glück

2603.06064 2026-03-06
AI LLM

A LINDDUN-based Privacy Threat Modeling Framework for GenAI

As generative AI (GenAI) systems become increasingly prevalent across various technological stacks, the question of how such systems handle sensitive and personal data flows becomes increasingly im...

Qianying Liao, Jonah Bellemans, Laurens Sion, Xue Jiang, Dmitrii Usynin, Xuebing Zhou, Dimitri Va...

2603.06051 2026-03-06
AI LLM

Pre-AI Baseline: Developer IDE Satisfaction and Tool Autonomy in 2022

To quantify the impact of AI on software development, the community requires a robust pre-AI baseline. This study analyzes valid satisfaction data from 1,155 software developers collected in July 2...

Nikola Balić

2603.06050 2026-03-06
AI LLM

Detecting Semantic Alignments between Textual Specifications and Domain Models

Context: Having domain models derived from textual specifications has proven to be very useful in the early phases of software engineering. However, creating correct domain models and establishing ...

Shwetali Shimangaud, Lola Burgueño, Rijul Saini, Jörg Kienzle

2603.06037 2026-03-06
AI LLM

Is it Me? Toward Self-Extension to AI Avatars in Virtual Reality

Advances in generative AI, speech synthesis, and embodied avatars enable systems that not only assist communication, but can act as proxies on users' behalf. Prior work in HCI has largely focused o...

Jieying Zhang, Steeven Villa, Abdallah El Ali

2603.06030 2026-03-06
AI LLM

Sensitivity-Aware Retrieval-Augmented Intent Clarification

In conversational search systems, a key component is to determine and clarify the intent behind complex queries. We view intent clarification in light of the exploratory search paradigm, where user...

Maik Larooij

2603.06025 2026-03-06
AI LLM

MASFactory: A Graph-centric Framework for Orchestrating LLM-Based Multi-Agent Systems with Vibe Graphing

Large language model-based (LLM-based) multi-agent systems (MAS) are increasingly used to extend agentic problem solving via role specialization and collaboration. MAS workflows can be naturally mo...

Yang Liu, Jinxuan Cai, Yishen Li, Qi Meng, Zedi Liu, Xin Li, Chen Qian, Chuan Shi, Cheng Yang

2603.06007 2026-03-06
AI LLM

EvoESAP: Non-Uniform Expert Pruning for Sparse MoE

Sparse Mixture-of-Experts (SMoE) language models achieve strong capability at low per-token compute, yet deployment remains memory- and throughput-bound because the full expert pool must be stored ...

Zongfang Liu, Shengkun Tang, Boyang Sun, Zhiqiang Shen, Xin Yuan

2603.06003 2026-03-06
AI LLM

MM-ISTS: Cooperating Irregularly Sampled Time Series Forecasting with Multimodal Vision-Text LLMs

Irregularly sampled time series (ISTS) are widespread in real-world scenarios, exhibiting asynchronous observations on uneven time intervals across variables. Existing ISTS forecasting methods ofte...

Zhi Lei, Chenxi Liu, Hao Miao, Wanghui Qiu, Bin Yang, Chenjuan Guo

2603.05997 2026-03-06
AI LLM

Fostering Knowledge Infrastructures in Science Communication and Aerospace Engineering

Knowledge infrastructures are defined as robust networks of people, artifacts, and institutions that generate, share and maintain specific knowledge. Yet, many domains are fragmented and far from r...

Tim Wittenborg

2603.05984 2026-03-06
AI LLM

An Interactive Multi-Agent System for Evaluation of New Product Concepts

Product concept evaluation is a critical stage that determines strategic resource allocation and project success in enterprises. However, traditional expert-led approaches face limitations such as ...

Bin Xuan, Ruo Ai, Hakyeon Lee

2603.05980 2026-03-06
AI LLM

Balancing Latency and Accuracy of Code Completion via Local-Cloud Model Cascading

Line-level code completion requires a critical balance between high accuracy and low latency. Existing methods suffer from a trade-off: large language models (LLMs) provide high-quality suggestions...

Hanzhen Lu, Lishui Fan, Jiachi Chen, Qiuyuan Chen, Zhao Wei, Zhongxin Liu

2603.05974 2026-03-06
AI LLM

THETA: A Textual Hybrid Embedding-based Topic Analysis Framework and AI Scientist Agent for Scalable Computational Social Science

The explosion of big social data has created a scalability trap for traditional qualitative research, as manual coding remains labor-intensive and conventional topic models often suffer from semant...

Zhenke Duan, Xin Li

2603.05972 2026-03-06
AI LLM

SwinYNet: A Transformer-based Multi-Task Model for Accurate and Efficient FRB Search

In this study, we present a transformer-based multi-task model for Fast Radio Burst (FRB) detection, signal segmentation, and parameter estimation directly from time-frequency data, without requiri...

Yunchuan Chen, Shulei Ni, Chan Li, Jianhua Fang, Dengke Zhou, Huaxi Chen, Yi Feng, Pei Wang, Chen...

2603.05958 2026-03-06
AI LLM

XAI for Coding Agent Failures: Transforming Raw Execution Traces into Actionable Insights

Large Language Model (LLM)-based coding agents show promise in automating software development tasks, yet they frequently fail in ways that are difficult for developers to understand and debug. Whi...

Arun Joshi

2603.05941 2026-03-06