Research

Papers

Research papers from arXiv and related sources

Total: 4513, AI/LLM: 2483, Testing: 2030
TESTING

Mainly on the Plane: Observing the Extended, Ionized Disks of Milky Way Analogs in IllustrisTNG

This paper explores the extent to which the circumgalactic medium (CGM) of Milky Way-like galaxies is located in an extended, ionized, disklike structure. To test this hypothesis, we analyze the sp...

Michael Messere, Kirill Tchernyshyov, Mary E. Putman, Greg L. Bryan, Jessica K. Werk, Yong Zheng,...

2603.22257 2026-03-23
AI LLM

EgoGroups: A Benchmark For Detecting Social Groups of People in the Wild

Social group detection, or the identification of humans involved in reciprocal interpersonal interactions (e.g., family members, friends, and customers and merchants), is a crucial component of soc...

Jeffri Murrugarra-Llerena, Pranav Chitale, Zicheng Liu, Kai Ao, Yujin Ham, Guha Balakrishnan, Pao...

2603.22249 2026-03-23
TESTING

RotorMap and Quantum Fingerprints of DNA Sequences via Rotary Position Embeddings

For strings of letters from a small alphabet, such as DNA sequences, we present a quantum encoding that empirically provides a strong correlation between the Levenshtein edit distance and the fidel...

Danylo Yakymenko, Maksym Chernyshev, Illia Savchenko, Sergii Strelchuk

2603.22245 2026-03-23
TESTING

Benchmarking Deep Learning Models for Aerial LiDAR Point Cloud Semantic Segmentation under Real Acquisition Conditions: A Case Study in Navarre

Recent advances in deep learning have significantly improved 3D semantic segmentation, but most models focus on indoor or terrestrial datasets. Their behavior under real aerial acquisition conditio...

Alex Salvatierra, José Antonio Sanz, Christian Gutiérrez, Mikel Galar

2603.22229 2026-03-23
AI LLM

SpatialReward: Verifiable Spatial Reward Modeling for Fine-Grained Spatial Consistency in Text-to-Image Generation

Recent advances in text-to-image (T2I) generation via reinforcement learning (RL) have benefited from reward models that assess semantic alignment and visual quality. However, most existing reward ...

Sashuai Zhou, Qiang Zhou, Junpeng Ma, Yue Cao, Ruofan Hu, Ziang Zhang, Xiaoda Yang, Zhibin Wang, ...

2603.22228 2026-03-23
AI LLM

Dyadic: A Scalable Platform for Human-Human and Human-AI Conversation Research

Conversation is ubiquitous in social life, but the empirical study of this interactive process has been thwarted by tools that are insufficiently modular and unadaptive to researcher needs. To reli...

David M. Markowitz

2603.22227 2026-03-23
TESTING

Noise Titration: Exact Distributional Benchmarking for Probabilistic Time Series Forecasting

Modern time series forecasting is evaluated almost entirely through passive observation of single historical trajectories, rendering claims about a model's robustness to non-stationarity fundamenta...

Qilin Wang

2603.22219 2026-03-23
AI LLM

Evaluating the Reliability and Fidelity of Automated Judgment Systems of Large Language Models

A Large Language Model (LLM) as judge evaluates the quality of victim Machine Learning (ML) models, specifically LLMs, by analyzing their outputs. An LLM as judge is the combination of one model an...

Tom Biskupski, Stephan Kleber

2603.22214 2026-03-23
AI LLM

SPA: A Simple but Tough-to-Beat Baseline for Knowledge Injection

While large language models (LLMs) are pretrained on massive amounts of data, their knowledge coverage remains incomplete in specialized, data-scarce domains, motivating extensive efforts to study ...

Kexian Tang, Jiani Wang, Shaowen Wang, Kaifeng Lyu

2603.22213 2026-03-23
AI LLM

Omni-WorldBench: Towards a Comprehensive Interaction-Centric Evaluation for World Models

Video-based world models have emerged along two dominant paradigms: video generation and 3D reconstruction. However, existing evaluation benchmarks either focus narrowly on visual fidelity and tex...

Meiqi Wu, Zhixin Cai, Fufangchen Zhao, Xiaokun Feng, Rujing Dang, Bingze Song, Ruitian Tian, Jias...

2603.22212 2026-03-23
AI LLM

Chimera: Latency- and Performance-Aware Multi-agent Serving for Heterogeneous LLMs

Multi-agent applications often execute complex tasks as multi-stage workflows, where each stage is an LLM call whose output becomes part of context for subsequent steps. Existing LLM serving system...

Kangqi Ni, Wenyue Hua, Xiaoxiang Shi, Jiang Guo, Shiyu Chang, Tianlong Chen

2603.22206 2026-03-23
AI LLM

CayleyPy-4: AI-Holography. Towards analogs of holographic string dualities for AI tasks

This is the fourth paper in the CayleyPy project, which applies AI methods to the exploration of large graphs. In this work, we suggest the existence of a new discrete version of holographic string...

A. Chervov, F. Levkovich-Maslyuk, A. Smolensky, F. Khafizov, I. Kiselev, D. Melnikov, I. Koltsov,...

2603.22195 2026-03-23
AI LLM

PAM: A Pose-Appearance-Motion Engine for Sim-to-Real HOI Video Generation

Hand-object interaction (HOI) reconstruction and synthesis are becoming central to embodied AI and AR/VR. Yet, despite rapid progress, existing HOI generation research remains fragmented across thr...

Mingju Gao, Kaisen Yang, Huan-ang Gao, Bohan Li, Ao Ding, Wenyi Li, Yangcheng Yu, Jinkun Liu, Sha...

2603.22193 2026-03-23
AI LLM

Enhancing Document-Level Machine Translation via Filtered Synthetic Corpora and Two-Stage LLM Adaptation

In Machine Translation, Large Language Models (LLMs) have generally underperformed compared to conventional encoder-decoder systems and thus see limited adoption. However, LLMs excel at modeling co...

Ireh Kim, Tesia Sker, Chanwoo Kim

2603.22186 2026-03-23
AI LLM

Revisiting Quantum Code Generation: Where Should Domain Knowledge Live?

Recent advances in large language models (LLMs) have enabled the automation of an increasing number of programming tasks, including code generation for scientific and engineering domains. In rapidl...

Oscar Novo, Oscar Bastidas-Jossa, Alberto Calvo, Antonio Peris, Carlos Kuchkovsky

2603.22184 2026-03-23
AI LLM

MARCUS: An agentic, multimodal vision-language model for cardiac diagnosis and management

Cardiovascular disease remains the leading cause of global mortality, with progress hindered by human interpretation of complex cardiac tests. Current AI vision-language models are limited to singl...

Jack W O'Sullivan, Mohammad Asadi, Lennart Elbe, Akshay Chaudhari, Tahoura Nedaee, Francois Hadda...

2603.22179 2026-03-23
AI LLM

Causal Evidence that Language Models use Confidence to Drive Behavior

Metacognition -- the ability to assess one's own cognitive performance -- is documented across species, with internal confidence estimates serving as a key signal for adaptive behavior. While confi...

Dharshan Kumaran, Nathaniel Daw, Simon Osindero, Petar Velickovic, Viorica Patraucean

2603.22161 2026-03-23
AI LLM

Multimodal Survival Analysis with Locally Deployable Large Language Models

We study multimodal survival analysis integrating clinical text, tabular covariates, and genomic profiles using locally deployable large language models (LLMs). As many institutions face tight comp...

Moritz Gögl, Christopher Yau

2603.22158 2026-03-23
TESTING

dynActivation: A Trainable Activation Family for Adaptive Nonlinearity

This paper proposes $\mathrm{dynActivation}$, a per-layer trainable activation defined as $f_i(x) = \mathrm{BaseAct}(x)(\alpha_i - \beta_i) + \beta_i x$, where $\alpha_i$ and $\beta_i$ are lightweight learned scalars th...

Alois Bachmann

2603.22154 2026-03-23
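
The dynActivation formula quoted in the abstract above can be sketched in a few lines. This is only an illustration of the stated expression, not the paper's implementation: the function names, the scalar (non-tensor) inputs, and the choice of ReLU as the base activation are all assumptions made here.

```python
from typing import Callable


def relu(x: float) -> float:
    """ReLU, used here as an assumed stand-in for BaseAct."""
    return x if x > 0.0 else 0.0


def dyn_activation(
    x: float,
    alpha: float,
    beta: float,
    base_act: Callable[[float], float] = relu,
) -> float:
    """Evaluate f(x) = BaseAct(x) * (alpha - beta) + beta * x.

    alpha and beta play the role of the per-layer scalars in the abstract;
    here they are plain floats rather than trained parameters.
    """
    return base_act(x) * (alpha - beta) + beta * x
```

Under the ReLU assumption the expression reduces to `alpha * x` for positive inputs and `beta * x` otherwise, and setting `alpha = 1`, `beta = 0` gives back the base activation unchanged.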
AI LLM

More Isn't Always Better: Balancing Decision Accuracy and Conformity Pressures in Multi-AI Advice

Just as people improve decision-making by consulting diverse human advisors, they can now also consult with multiple AI systems. Prior work on group decision-making shows that advice aggregation cr...

Yuta Tsuchiya, Yukino Baba

2603.22152 2026-03-23