Research

Papers

Research papers from arXiv and related sources

Total: 4513; AI/LLM: 2483; Testing: 2030
AI LLM

UniMotion: A Unified Framework for Motion-Text-Vision Understanding and Generation

We present UniMotion, to our knowledge the first unified framework for simultaneous understanding and generation of human motion, natural language, and RGB images within a single architecture. Exis...

Ziyi Wang, Xinshun Wang, Shuang Chen, Yang Cong, Mengyuan Liu

2603.22282 2026-03-23
AI LLM

3D-Layout-R1: Structured Reasoning for Language-Instructed Spatial Editing

Large Language Models (LLMs) and Vision Language Models (VLMs) have shown impressive reasoning abilities, yet they struggle with spatial understanding and layout consistency when performing fine-gr...

Haoyu Zhen, Xiaolong Li, Yilin Zhao, Han Zhang, Sifei Liu, Kaichun Mo, Chuang Gan, Subhashree Rad...

2603.22279 2026-03-23
AI LLM

Greater accessibility can amplify discrimination in generative AI

Hundreds of millions of people rely on large language models (LLMs) for education, work, and even healthcare. Yet these models are known to reproduce and amplify social biases present in their trai...

Carolin Holtermann, Minh Duc Bui, Kaitlyn Zhou, Valentin Hofmann, Katharina von der Wense, Anne L...

2603.22260 2026-03-23
AI LLM

EgoGroups: A Benchmark For Detecting Social Groups of People in the Wild

Social group detection, or the identification of humans involved in reciprocal interpersonal interactions (e.g., family members, friends, and customers and merchants), is a crucial component of soc...

Jeffri Murrugarra-Llerena, Pranav Chitale, Zicheng Liu, Kai Ao, Yujin Ham, Guha Balakrishnan, Pao...

2603.22249 2026-03-23
AI LLM

SpatialReward: Verifiable Spatial Reward Modeling for Fine-Grained Spatial Consistency in Text-to-Image Generation

Recent advances in text-to-image (T2I) generation via reinforcement learning (RL) have benefited from reward models that assess semantic alignment and visual quality. However, most existing reward ...

Sashuai Zhou, Qiang Zhou, Junpeng Ma, Yue Cao, Ruofan Hu, Ziang Zhang, Xiaoda Yang, Zhibin Wang, ...

2603.22228 2026-03-23
AI LLM

Dyadic: A Scalable Platform for Human-Human and Human-AI Conversation Research

Conversation is ubiquitous in social life, but the empirical study of this interactive process has been thwarted by tools that are insufficiently modular and unadaptive to researcher needs. To reli...

David M. Markowitz

2603.22227 2026-03-23
AI LLM

Evaluating the Reliability and Fidelity of Automated Judgment Systems of Large Language Models

A Large Language Model (LLM) as judge evaluates the quality of victim Machine Learning (ML) models, specifically LLMs, by analyzing their outputs. An LLM as judge is the combination of one model an...

Tom Biskupski, Stephan Kleber

2603.22214 2026-03-23
AI LLM

SPA: A Simple but Tough-to-Beat Baseline for Knowledge Injection

While large language models (LLMs) are pretrained on massive amounts of data, their knowledge coverage remains incomplete in specialized, data-scarce domains, motivating extensive efforts to study ...

Kexian Tang, Jiani Wang, Shaowen Wang, Kaifeng Lyu

2603.22213 2026-03-23
AI LLM

Omni-WorldBench: Towards a Comprehensive Interaction-Centric Evaluation for World Models

Video-based world models have emerged along two dominant paradigms: video generation and 3D reconstruction. However, existing evaluation benchmarks either focus narrowly on visual fidelity and tex...

Meiqi Wu, Zhixin Cai, Fufangchen Zhao, Xiaokun Feng, Rujing Dang, Bingze Song, Ruitian Tian, Jias...

2603.22212 2026-03-23
AI LLM

Chimera: Latency- and Performance-Aware Multi-agent Serving for Heterogeneous LLMs

Multi-agent applications often execute complex tasks as multi-stage workflows, where each stage is an LLM call whose output becomes part of context for subsequent steps. Existing LLM serving system...

Kangqi Ni, Wenyue Hua, Xiaoxiang Shi, Jiang Guo, Shiyu Chang, Tianlong Chen

2603.22206 2026-03-23
AI LLM

CayleyPy-4: AI-Holography. Towards analogs of holographic string dualities for AI tasks

This is the fourth paper in the CayleyPy project, which applies AI methods to the exploration of large graphs. In this work, we suggest the existence of a new discrete version of holographic string...

A. Chervov, F. Levkovich-Maslyuk, A. Smolensky, F. Khafizov, I. Kiselev, D. Melnikov, I. Koltsov,...

2603.22195 2026-03-23
AI LLM

PAM: A Pose-Appearance-Motion Engine for Sim-to-Real HOI Video Generation

Hand-object interaction (HOI) reconstruction and synthesis are becoming central to embodied AI and AR/VR. Yet, despite rapid progress, existing HOI generation research remains fragmented across thr...

Mingju Gao, Kaisen Yang, Huan-ang Gao, Bohan Li, Ao Ding, Wenyi Li, Yangcheng Yu, Jinkun Liu, Sha...

2603.22193 2026-03-23
AI LLM

Enhancing Document-Level Machine Translation via Filtered Synthetic Corpora and Two-Stage LLM Adaptation

In Machine Translation, Large Language Models (LLMs) have generally underperformed compared to conventional encoder-decoder systems and thus see limited adoption. However, LLMs excel at modeling co...

Ireh Kim, Tesia Sker, Chanwoo Kim

2603.22186 2026-03-23
AI LLM

Revisiting Quantum Code Generation: Where Should Domain Knowledge Live?

Recent advances in large language models (LLMs) have enabled the automation of an increasing number of programming tasks, including code generation for scientific and engineering domains. In rapidl...

Oscar Novo, Oscar Bastidas-Jossa, Alberto Calvo, Antonio Peris, Carlos Kuchkovsky

2603.22184 2026-03-23
AI LLM

MARCUS: An agentic, multimodal vision-language model for cardiac diagnosis and management

Cardiovascular disease remains the leading cause of global mortality, with progress hindered by human interpretation of complex cardiac tests. Current AI vision-language models are limited to singl...

Jack W O'Sullivan, Mohammad Asadi, Lennart Elbe, Akshay Chaudhari, Tahoura Nedaee, Francois Hadda...

2603.22179 2026-03-23
AI LLM

Causal Evidence that Language Models use Confidence to Drive Behavior

Metacognition, the ability to assess one's own cognitive performance, is documented across species, with internal confidence estimates serving as a key signal for adaptive behavior. While confi...

Dharshan Kumaran, Nathaniel Daw, Simon Osindero, Petar Velickovic, Viorica Patraucean

2603.22161 2026-03-23
AI LLM

Multimodal Survival Analysis with Locally Deployable Large Language Models

We study multimodal survival analysis integrating clinical text, tabular covariates, and genomic profiles using locally deployable large language models (LLMs). As many institutions face tight comp...

Moritz Gögl, Christopher Yau

2603.22158 2026-03-23
AI LLM

More Isn't Always Better: Balancing Decision Accuracy and Conformity Pressures in Multi-AI Advice

Just as people improve decision-making by consulting diverse human advisors, they can now also consult with multiple AI systems. Prior work on group decision-making shows that advice aggregation cr...

Yuta Tsuchiya, Yukino Baba

2603.22152 2026-03-23
AI LLM

The Semantic Ladder: A Framework for Progressive Formalization of Natural Language Content for Knowledge Graphs and AI Systems

Semantic data and knowledge infrastructures must reconcile two fundamentally different forms of representation: natural language, in which most knowledge is created and communicated, and formal sem...

Lars Vogt

2603.22136 2026-03-23
AI LLM

Navigational Thinking as an Emerging Paradigm of Computer Science in the Age of Generative AI

Generative AI systems produce meaning with a quality indistinguishable from, and occasionally surpassing, human performance, yet the epistemic mechanism through which this occurs remains poorly u...

Ilya Levin

2603.22133 2026-03-23