Papers

Research papers from arXiv and related sources

Total: 4513, AI/LLM: 2483, Testing: 2030
AI LLM

DetPO: In-Context Learning with Multi-Modal LLMs for Few-Shot Object Detection

Multi-Modal LLMs (MLLMs) demonstrate strong visual grounding capabilities on popular object detection benchmarks like OdinW-13 and RefCOCO. However, state-of-the-art models still struggle to genera...

Gautam Rajendrakumar Gare, Neehar Peri, Matvei Popov, Shruti Jain, John Galeotti, Deva Ramanan

2603.23455 2026-03-24
AI LLM

Code Review Agent Benchmark

Software engineering agents have shown significant promise in writing code. As AI agents permeate code writing, and generate huge volumes of code automatically -- the matter of code quality comes f...

Yuntong Zhang, Zhiyuan Pan, Imam Nur Bani Yusuf, Haifeng Ruan, Ridwan Shariffdeen, Abhik Roychoud...

2603.23448 2026-03-24
AI LLM

3DCity-LLM: Empowering Multi-modality Large Language Models for 3D City-scale Perception and Understanding

While multi-modality large language models excel in object-centric or indoor scenarios, scaling them to 3D city-scale environments remains a formidable challenge. To bridge this gap, we propose 3DC...

Yiping Chen, Jinpeng Li, Wenyu Ke, Yang Luo, Jie Ouyang, Zhongjie He, Li Liu, Hongchao Fan, Hao Wu

2603.23447 2026-03-24
AI LLM

Evaluating LLM-Based Test Generation Under Software Evolution

Large Language Models (LLMs) are increasingly used for automated unit test generation. However, it remains unclear whether these tests reflect genuine reasoning about program behavior or simply rep...

Sabaat Haroon, Mohammad Taha Khan, Muhammad Ali Gulzar

2603.23443 2026-03-24
TESTING

MuSe: a Mutation Testing Plugin for the Remix IDE

Mutation testing is a technique to assess the effectiveness of test suites by introducing artificial faults into programs. Although mutation testing plugins are available for many platforms and lan...

Gerardo Iuliano, Daniele Carangelo, Carmine Calabrese, Dario Di Nucci

2603.23441 2026-03-24
AI LLM

Similarity-Aware Mixture-of-Experts for Data-Efficient Continual Learning

Machine learning models often need to adapt to new data after deployment due to structured or unstructured real-world dynamics. The Continual Learning (CL) framework enables continuous model adapta...

Connor Mclaughlin, Nigel Lee, Lili Su

2603.23436 2026-03-24
AI LLM

Mecha-nudges for Machines

Nudges are subtle changes to the way choices are presented to human decision-makers (e.g., opt-in vs. opt-out by default) that shift behavior without restricting options or changing incentives. As ...

Giulio Frey, Kawin Ethayarajh

2603.23433 2026-03-24
TESTING

Quantum simulation of Motzkin spin chain with Rydberg atoms

Motzkin spin chain is a well-known mathematical model with connections to symmetry-protected topological phases, such as the Haldane phase, as well as to concepts in the AdS/CFT correspondence. The...

Kaustav Mukherjee, Hatem Barghathi, Adrian Del Maestro, Rick Mukherjee

2603.23422 2026-03-24
AI LLM

Bilevel Autoresearch: Meta-Autoresearching Itself

If autoresearch is itself a form of research, then autoresearch can be applied to research itself. We take this idea literally: we use an autoresearch loop to optimize the autoresearch loop. Every ...

Yaonan Qu, Meng Lu

2603.23420 2026-03-24
AI LLM

Biased Error Attribution in Multi-Agent Human-AI Systems Under Delayed Feedback

Human decision-making is strongly influenced by cognitive biases, particularly under conditions of uncertainty and risk. While prior work has examined bias in single-step decisions with immediate o...

Teerthaa Parakh, Karen M. Feigh

2603.23419 2026-03-24
AI LLM

Integrating GenAI in Filmmaking: From Co-Creativity to Distributed Creativity

The integration of Generative AI (GenAI) into audio-visual production is often presented as a radical break from past traditions. However, through a sociomaterial and historical lens, this paper ar...

Pierluigi Masai, Lorenzo Carta, Mateusz Miroslaw Lis

2603.23415 2026-03-24
AI LLM

SortedRL: Accelerating RL Training for LLMs through Online Length-Aware Scheduling

Scaling reinforcement learning (RL) has shown strong promise for enhancing the reasoning abilities of large language models (LLMs), particularly in tasks requiring long chain-of-thought generation....

Yiqi Zhang, Huiqiang Jiang, Xufang Luo, Zhihe Yang, Chengruidong Zhang, Yifei Shen, Dongsheng Li,...

2603.23414 2026-03-24
AI LLM

Beyond Preset Identities: How Agents Form Stances and Boundaries in Generative Societies

While large language models simulate social behaviors, their capacity for stable stance formation and identity negotiation during complex interventions remains unclear. To overcome the limitations ...

Hanzhong Zhang, Siyang Song, Jindong Wang

2603.23406 2026-03-24
AI LLM

Unleashing Spatial Reasoning in Multimodal Large Language Models via Textual Representation Guided Reasoning

Existing Multimodal Large Language Models (MLLMs) struggle with 3D spatial reasoning, as they fail to construct structured abstractions of the 3D environment depicted in video inputs. To bridge thi...

Jiacheng Hua, Yishu Yin, Yuhang Wu, Tai Wang, Yifei Huang, Miao Liu

2603.23404 2026-03-24
TESTING

Graph Energy Matching: Transport-Aligned Energy-Based Modeling for Graph Generation

Energy-based models for discrete domains, such as graphs, explicitly capture relative likelihoods, naturally enabling composable probabilistic inference tasks like conditional generation or enforci...

Michal Balcerak, Suprosana Shit, Chinmay Prabhakar, Sebastian Kaltenbach, Michael S. Albergo, Yil...

2603.23398 2026-03-24
TESTING

Piecewise M-Stationarity and Related Algorithms for Mathematical Programs with Complementarity Constraints

This study explores B-stationarity of mathematical programs with complementarity constraints (MPCCs) and convergence behavior of MPCC algorithms. Special attention is given to the cases with biacti...

Kexin Wang, Lorenz T. Biegler

2603.23389 2026-03-24
AI LLM

SIMART: Decomposing Monolithic Meshes into Sim-ready Articulated Assets via MLLM

High-quality articulated 3D assets are indispensable for embodied AI and physical simulation, yet 3D generation still focuses on static meshes, leaving a gap in "sim-ready" interactive objects. Mos...

Chuanrui Zhang, Minghan Qin, Yuang Wang, Baifeng Xie, Hang Li, Ziwei Wang

2603.23386 2026-03-24
TESTING

Shape-Adaptive Conditional Calibration for Conformal Prediction via Minimax Optimization

Achieving valid conditional coverage in conformal prediction is challenging due to the theoretical difficulty of satisfying pointwise constraints in finite samples. Building upon the characterizati...

Yajie Bao, Chuchen Zhang, Zhaojun Wang, Haojie Ren, Changliang Zou

2603.23374 2026-03-24
TESTING

An Opacity-Free Test of the Cosmic Distance Duality Relation Using Strongly Lensed Gravitational Wave Signals with Space-Based Detector Networks

The cosmic distance duality relation (CDDR), expressed as $d_L(z) = (1+z)^2 D_A(z)$, is a fundamental relation in modern cosmology. In this work, we apply a method to test the CDDR using simulated ...

Yong Yuan, Minghui Du, Benyang Zhu, Xin-yi Lin, Wen-Fan Feng, Peng Xu, Xilong Fan

2603.23373 2026-03-24
AI LLM

Central Dogma Transformer III: Interpretable AI Across DNA, RNA, and Protein

Biological AI models increasingly predict complex cellular responses, yet their learned representations remain disconnected from the molecular processes they aim to capture. We present CDT-III, whi...

Nobuyuki Ota

2603.23361 2026-03-24