Papers

Research papers from arXiv and related sources

Total: 4694, AI/LLM: 2583, Testing: 2111
TESTING

Stochastic set-valued optimization and its application to robust learning

In this paper, we develop a stochastic set-valued optimization (SVO) framework tailored for robust machine learning. In the SVO setting, each decision variable is mapped to a set of objective value...

Tommaso Giovannelli, Jingfu Tan, Luis Nunes Vicente

2603.17691 2026-03-18
AI LLM

Sensi: Learn One Thing at a Time -- Curriculum-Based Test-Time Learning for LLM Game Agents

Large language model (LLM) agents deployed in unknown environments must learn task structure at test time, but current approaches require thousands of interactions to form useful hypotheses. We pre...

Mohsen Arjmandi

2603.17683 2026-03-18
AI LLM

WeatherReasonSeg: A Benchmark for Weather-Aware Reasoning Segmentation in Visual Language Models

Existing vision-language models (VLMs) have demonstrated impressive performance in reasoning-based segmentation. However, current benchmarks are primarily constructed from high-quality images captu...

Wanjun Du, Zifeng Yuan, Tingting Chen, Fucai Ke, Beibei Lin, Shunli Zhang

2603.17680 2026-03-18
AI LLM

Post-Training Local LLM Agents for Linux Privilege Escalation with Verifiable Rewards

LLM agents are increasingly relevant to research domains such as vulnerability discovery. Yet, the strongest systems remain closed and cloud-only, making them resource-intensive, difficult to repro...

Philipp Normann, Andreas Happe, Jürgen Cito, Daniel Arp

2603.17673 2026-03-18
AI LLM

AgentVLN: Towards Agentic Vision-and-Language Navigation

Vision-and-Language Navigation (VLN) requires an embodied agent to ground complex natural-language instructions into long-horizon navigation in unseen environments. While Vision-Language Models (VL...

Zihao Xin, Wentong Li, Yixuan Jiang, Ziyuan Huang, Bin Wang, Piji Li, Jianke Zhu, Jie Qin, Shengj...

2603.17670 2026-03-18
AI LLM

Halo: Domain-Aware Query Optimization for Long-Context Question Answering

Long-context question answering (QA) over lengthy documents is critical for applications such as financial analysis, legal review, and scientific research. Current approaches, such as processing en...

Pramod Chunduri, Francisco Romero, Ali Payani, Kexin Rong, Joy Arulraj

2603.17668 2026-03-18
AI LLM

From Symbol to Meaning: Ontological and Philosophical Reflections on Large Language Models in Information Systems Engineering

The advent of Large Language Models (LLMs) represents a turning point in the theoretical foundations of Information Systems Engineering. Beyond their technical significance, LLMs challenge the onto...

José Palazzo Moreira de Oliveira

2603.17659 2026-03-18
TESTING

Requirements Volatility in Software Architecture Design: An Exploratory Case Study

Requirements volatility is a major issue in software (SW) development, causing problems such as project delays and cost overruns. Even though there is a considerable amount of research related to r...

Sanja Aaramaa, Sandun Dasanayake, Markku Oivo, Jouni Markkula, Samuli Saukkonen

2603.17648 2026-03-18
AI LLM

Part-Aware Open-Vocabulary 3D Affordance Grounding via Prototypical Semantic and Geometric Alignment

Grounding natural language questions to functionally relevant regions in 3D objects -- termed language-driven 3D affordance grounding -- is essential for embodied intelligence and human-AI interact...

Dongqiang Gou, Xuming He

2603.17647 2026-03-18
AI LLM

Who's Sense is This? Possibility for Impacting Human Insights in AI-assisted Sensemaking

Sensemaking is an important preceding step for activities like consensus building and decision-making. When groups of people make sense of large amounts of information, their understanding graduall...

Zhuoyi Cheng, Steven Houben

2603.17643 2026-03-18
AI LLM

VeriGrey: Greybox Agent Validation

Agentic AI has been a topic of great interest recently. A Large Language Model (LLM) agent involves one or more LLMs in the back-end. In the front end, it conducts autonomous decision-making by com...

Yuntong Zhang, Sungmin Kang, Ruijie Meng, Marcel Böhme, Abhik Roychoudhury

2603.17639 2026-03-18
TESTING

DSS-GAN: Directional State Space GAN with Mamba backbone for Class-Conditional Image Synthesis

We present DSS-GAN, the first generative adversarial network to employ Mamba as a hierarchical generator backbone for noise-to-image synthesis. The central contribution is Directional Latent Routin...

Aleksander Ogonowski, Konrad Klimaszewski, Przemysław Rokita

2603.17637 2026-03-18
AI LLM

A Multi-Agent System for Building-Age Cohort Mapping to Support Urban Energy Planning

Determining the age distribution of the urban building stock is crucial for sustainable municipal heat planning and upgrade prioritization. However, existing approaches often rely on datasets gathe...

Kundan Thota, Thorsten Schlachter, Veit Hagenmeyer

2603.17626 2026-03-18
AI LLM

Do Language Models Encode Semantic Relations? Probing and Sparse Feature Analysis

Understanding whether large language models (LLMs) capture structured meaning requires examining how they represent concept relationships. In this work, we study three models of increasing scale: P...

Andor Diera, Ansgar Scherp

2603.17624 2026-03-18
AI LLM

Complementary Reinforcement Learning

Reinforcement Learning (RL) has emerged as a powerful paradigm for training LLM-based agents, yet remains limited by low sample efficiency, stemming not only from sparse outcome feedback but also f...

Dilxat Muhtar, Jiashun Liu, Wei Gao, Weixun Wang, Shaopan Xiong, Ju Huang, Siran Yang, Wenbo Su, ...

2603.17621 2026-03-18
AI LLM

VeriAgent: A Tool-Integrated Multi-Agent System with Evolving Memory for PPA-Aware RTL Code Generation

LLMs have recently demonstrated strong capabilities in automatic RTL code generation, achieving high syntactic and functional correctness. However, most methods focus on functional correctness whil...

Yaoxiang Wang, Qi Shi, ShangZhan Li, Qingguo Hu, Xinyu Yin, Bo Guo, Xu Han, Maosong Sun, Jinsong Su

2603.17613 2026-03-18
TESTING

On the validity limits of the parametrisation method for invariant manifolds: an assessment of practical criteria for vibrating systems

The parametrisation method for invariant manifolds is a powerful technique for deriving reduced-order models in the context of nonlinear vibrating systems, allowing accurate computations of nonline...

André de Figueiredo Stabile, Aurélien Grolet, Alessandra Vizzaccaro, Cyril Touzé

2603.17611 2026-03-18
TESTING

An optimal control approach to nonlinear wave speed selection in reaction-diffusion equations

Travelling wave solutions of reaction-diffusion equations are widely used to model the spatial spread of populations and other phenomena in biology and physics. In this article, we reinterpret the ...

Rebecca M. Crossley, Carles Falco, Ruth E. Baker

2603.17601 2026-03-18
TESTING

Zero entropy cycles on trees: from Topology to Combinatorics and an application to star maps

In this paper we give a fully combinatorial description of the zero entropy periodic patterns on trees. Unlike previously known characterizations of such patterns, our criterion is independent of a...

D. Juher, F. Mañosas, D. Rojas

2603.17598 2026-03-18
AI LLM

Modeling Changing Scientific Concepts with Complex Networks: A Case Study on the Chemical Revolution

While context embeddings produced by LLMs can be used to estimate conceptual change, these representations are often not interpretable nor time-aware. Moreover, bias augmentation in historical data...

Sofía Aguilar-Valdez, Stefania Degaetano-Ortlieb

2603.17594 2026-03-18