Papers
Research papers from arXiv and related sources
Experiences Build Characters: The Linguistic Origins and Functional Impact of LLM Personality
Human problem-solving is enriched by a diversity of styles and personality traits, yet the development of Large Language Models (LLMs) has largely prioritized uniform performance benchmarks that fa...
Xi Wang, Mengdie Zhuang, Jiqun Liu
Lyapunov Probes for Hallucination Detection in Large Foundation Models
We address hallucination detection in Large Language Models (LLMs) and Multimodal Large Language Models (MLLMs) by framing the problem through the lens of dynamical systems stability theory. Rather...
Bozhi Luan, Gen Li, Yalan Qin, Jifeng Guo, Yun Zhou, Faguo Wu, Hongwei Zheng, Wenjun Wu, Zhaoxin Fan
Distributed Semantic Alignment over Interference Channels: A Game-Theoretic Approach
Semantic communication acts as a key enabler for effective task execution in AI-driven systems, prioritizing the extraction of the underlying meaning before transmission. However, when devices rely...
Giuseppe Di Poce, Mattia Merluzzi, Emilio Calvanese Strinati, Paolo Di Lorenzo
Evaluating Austrian A-Level German Essays with Large Language Models for Automated Essay Scoring
Automated Essay Scoring (AES) has been explored for decades with the goal of supporting teachers by reducing grading workload and mitigating subjective biases. While early systems relied on handcrafte...
Jonas Kubesch, Lena Huber, Clemens Havas
ChatShopBuddy: Towards Reliable Conversational Shopping Agents via Reinforcement Learning
Conversational shopping agents represent a critical consumer-facing application of Large Language Model (LLM)-powered agents, yet how to effectively apply post-training Reinforcement Learning (RL) ...
Yiruo Cheng, Kelong Mao, Tianhao Li, Jiejun Tan, Ji-Rong Wen, Zhicheng Dou
Agentic LLM Planning via Step-Wise PDDL Simulation: An Empirical Characterisation
Task planning, the problem of sequencing actions to reach a goal from an initial state, is a core capability requirement for autonomous robotic systems. Whether large language models (LLMs) can ser...
Kai Göbel, Pierrick Lorang, Patrik Zips, Tobias Glück
A LINDDUN-based Privacy Threat Modeling Framework for GenAI
As generative AI (GenAI) systems become increasingly prevalent across various technological stacks, the question of how such systems handle sensitive and personal data flows becomes increasingly im...
Qianying Liao, Jonah Bellemans, Laurens Sion, Xue Jiang, Dmitrii Usynin, Xuebing Zhou, Dimitri Va...
Pre-AI Baseline: Developer IDE Satisfaction and Tool Autonomy in 2022
To quantify the impact of AI on software development, the community requires a robust pre-AI baseline. This study analyzes valid satisfaction data from 1,155 software developers collected in July 2...
Nikola Balić
Detecting Semantic Alignments between Textual Specifications and Domain Models
Context: Having domain models derived from textual specifications has proven to be very useful in the early phases of software engineering. However, creating correct domain models and establishing ...
Shwetali Shimangaud, Lola Burgueño, Rijul Saini, Jörg Kienzle
Is it Me? Toward Self-Extension to AI Avatars in Virtual Reality
Advances in generative AI, speech synthesis, and embodied avatars enable systems that not only assist communication, but can act as proxies on users' behalf. Prior work in HCI has largely focused o...
Jieying Zhang, Steeven Villa, Abdallah El Ali
Sensitivity-Aware Retrieval-Augmented Intent Clarification
In conversational search systems, a key component is to determine and clarify the intent behind complex queries. We view intent clarification in light of the exploratory search paradigm, where user...
Maik Larooij
MASFactory: A Graph-centric Framework for Orchestrating LLM-Based Multi-Agent Systems with Vibe Graphing
Large language model-based (LLM-based) multi-agent systems (MAS) are increasingly used to extend agentic problem solving via role specialization and collaboration. MAS workflows can be naturally mo...
Yang Liu, Jinxuan Cai, Yishen Li, Qi Meng, Zedi Liu, Xin Li, Chen Qian, Chuan Shi, Cheng Yang
EvoESAP: Non-Uniform Expert Pruning for Sparse MoE
Sparse Mixture-of-Experts (SMoE) language models achieve strong capability at low per-token compute, yet deployment remains memory- and throughput-bound because the full expert pool must be stored ...
Zongfang Liu, Shengkun Tang, Boyang Sun, Zhiqiang Shen, Xin Yuan
MM-ISTS: Cooperating Irregularly Sampled Time Series Forecasting with Multimodal Vision-Text LLMs
Irregularly sampled time series (ISTS) are widespread in real-world scenarios, exhibiting asynchronous observations on uneven time intervals across variables. Existing ISTS forecasting methods ofte...
Zhi Lei, Chenxi Liu, Hao Miao, Wanghui Qiu, Bin Yang, Chenjuan Guo
Fostering Knowledge Infrastructures in Science Communication and Aerospace Engineering
Knowledge infrastructures are defined as robust networks of people, artifacts, and institutions that generate, share and maintain specific knowledge. Yet, many domains are fragmented and far from r...
Tim Wittenborg
An Interactive Multi-Agent System for Evaluation of New Product Concepts
Product concept evaluation is a critical stage that determines strategic resource allocation and project success in enterprises. However, traditional expert-led approaches face limitations such as ...
Bin Xuan, Ruo Ai, Hakyeon Lee
Balancing Latency and Accuracy of Code Completion via Local-Cloud Model Cascading
Line-level code completion requires a critical balance between high accuracy and low latency. Existing methods suffer from a trade-off: large language models (LLMs) provide high-quality suggestions...
Hanzhen Lu, Lishui Fan, Jiachi Chen, Qiuyuan Chen, Zhao Wei, Zhongxin Liu
THETA: A Textual Hybrid Embedding-based Topic Analysis Framework and AI Scientist Agent for Scalable Computational Social Science
The explosion of big social data has created a scalability trap for traditional qualitative research, as manual coding remains labor-intensive and conventional topic models often suffer from semant...
Zhenke Duan, Xin Li
SwinYNet: A Transformer-based Multi-Task Model for Accurate and Efficient FRB Search
In this study, we present a transformer-based multi-task model for Fast Radio Burst (FRB) detection, signal segmentation, and parameter estimation directly from time-frequency data, without requiri...
Yunchuan Chen, Shulei Ni, Chan Li, Jianhua Fang, Dengke Zhou, Huaxi Chen, Yi Feng, Pei Wang, Chen...
XAI for Coding Agent Failures: Transforming Raw Execution Traces into Actionable Insights
Large Language Model (LLM)-based coding agents show promise in automating software development tasks, yet they frequently fail in ways that are difficult for developers to understand and debug. Whi...
Arun Joshi