Papers
Research papers from arXiv and related sources
Mainly on the Plane: Observing the Extended, Ionized Disks of Milky Way Analogs in IllustrisTNG
This paper explores the extent to which the circumgalactic medium (CGM) of Milky Way-like galaxies is located in an extended, ionized, disklike structure. To test this hypothesis, we analyze the sp...
Michael Messere, Kirill Tchernyshyov, Mary E. Putman, Greg L. Bryan, Jessica K. Werk, Yong Zheng,...
EgoGroups: A Benchmark For Detecting Social Groups of People in the Wild
Social group detection, or the identification of humans involved in reciprocal interpersonal interactions (e.g., family members, friends, and customers and merchants), is a crucial component of soc...
Jeffri Murrugarra-Llerena, Pranav Chitale, Zicheng Liu, Kai Ao, Yujin Ham, Guha Balakrishnan, Pao...
RotorMap and Quantum Fingerprints of DNA Sequences via Rotary Position Embeddings
For strings of letters from a small alphabet, such as DNA sequences, we present a quantum encoding that empirically provides a strong correlation between the Levenshtein edit distance and the fidel...
Danylo Yakymenko, Maksym Chernyshev, Illia Savchenko, Sergii Strelchuk
Benchmarking Deep Learning Models for Aerial LiDAR Point Cloud Semantic Segmentation under Real Acquisition Conditions: A Case Study in Navarre
Recent advances in deep learning have significantly improved 3D semantic segmentation, but most models focus on indoor or terrestrial datasets. Their behavior under real aerial acquisition conditio...
Alex Salvatierra, José Antonio Sanz, Christian Gutiérrez, Mikel Galar
SpatialReward: Verifiable Spatial Reward Modeling for Fine-Grained Spatial Consistency in Text-to-Image Generation
Recent advances in text-to-image (T2I) generation via reinforcement learning (RL) have benefited from reward models that assess semantic alignment and visual quality. However, most existing reward ...
Sashuai Zhou, Qiang Zhou, Junpeng Ma, Yue Cao, Ruofan Hu, Ziang Zhang, Xiaoda Yang, Zhibin Wang, ...
Dyadic: A Scalable Platform for Human-Human and Human-AI Conversation Research
Conversation is ubiquitous in social life, but the empirical study of this interactive process has been thwarted by tools that are insufficiently modular and unadaptive to researcher needs. To reli...
David M. Markowitz
Noise Titration: Exact Distributional Benchmarking for Probabilistic Time Series Forecasting
Modern time series forecasting is evaluated almost entirely through passive observation of single historical trajectories, rendering claims about a model's robustness to non-stationarity fundamenta...
Qilin Wang
Evaluating the Reliability and Fidelity of Automated Judgment Systems of Large Language Models
A Large Language Model (LLM) as judge evaluates the quality of victim Machine Learning (ML) models, specifically LLMs, by analyzing their outputs. An LLM as judge is the combination of one model an...
Tom Biskupski, Stephan Kleber
SPA: A Simple but Tough-to-Beat Baseline for Knowledge Injection
While large language models (LLMs) are pretrained on massive amounts of data, their knowledge coverage remains incomplete in specialized, data-scarce domains, motivating extensive efforts to study ...
Kexian Tang, Jiani Wang, Shaowen Wang, Kaifeng Lyu
Omni-WorldBench: Towards a Comprehensive Interaction-Centric Evaluation for World Models
Video-based world models have emerged along two dominant paradigms: video generation and 3D reconstruction. However, existing evaluation benchmarks either focus narrowly on visual fidelity and tex...
Meiqi Wu, Zhixin Cai, Fufangchen Zhao, Xiaokun Feng, Rujing Dang, Bingze Song, Ruitian Tian, Jias...
Chimera: Latency- and Performance-Aware Multi-agent Serving for Heterogeneous LLMs
Multi-agent applications often execute complex tasks as multi-stage workflows, where each stage is an LLM call whose output becomes part of context for subsequent steps. Existing LLM serving system...
Kangqi Ni, Wenyue Hua, Xiaoxiang Shi, Jiang Guo, Shiyu Chang, Tianlong Chen
CayleyPy-4: AI-Holography. Towards analogs of holographic string dualities for AI tasks
This is the fourth paper in the CayleyPy project, which applies AI methods to the exploration of large graphs. In this work, we suggest the existence of a new discrete version of holographic string...
A. Chervov, F. Levkovich-Maslyuk, A. Smolensky, F. Khafizov, I. Kiselev, D. Melnikov, I. Koltsov,...
PAM: A Pose-Appearance-Motion Engine for Sim-to-Real HOI Video Generation
Hand-object interaction (HOI) reconstruction and synthesis are becoming central to embodied AI and AR/VR. Yet, despite rapid progress, existing HOI generation research remains fragmented across thr...
Mingju Gao, Kaisen Yang, Huan-ang Gao, Bohan Li, Ao Ding, Wenyi Li, Yangcheng Yu, Jinkun Liu, Sha...
Enhancing Document-Level Machine Translation via Filtered Synthetic Corpora and Two-Stage LLM Adaptation
In Machine Translation, Large Language Models (LLMs) have generally underperformed compared to conventional encoder-decoder systems and thus see limited adoption. However, LLMs excel at modeling co...
Ireh Kim, Tesia Sker, Chanwoo Kim
Revisiting Quantum Code Generation: Where Should Domain Knowledge Live?
Recent advances in large language models (LLMs) have enabled the automation of an increasing number of programming tasks, including code generation for scientific and engineering domains. In rapidl...
Oscar Novo, Oscar Bastidas-Jossa, Alberto Calvo, Antonio Peris, Carlos Kuchkovsky
MARCUS: An agentic, multimodal vision-language model for cardiac diagnosis and management
Cardiovascular disease remains the leading cause of global mortality, with progress hindered by human interpretation of complex cardiac tests. Current AI vision-language models are limited to singl...
Jack W O'Sullivan, Mohammad Asadi, Lennart Elbe, Akshay Chaudhari, Tahoura Nedaee, Francois Hadda...
Causal Evidence that Language Models use Confidence to Drive Behavior
Metacognition -- the ability to assess one's own cognitive performance -- is documented across species, with internal confidence estimates serving as a key signal for adaptive behavior. While confi...
Dharshan Kumaran, Nathaniel Daw, Simon Osindero, Petar Velickovic, Viorica Patraucean
Multimodal Survival Analysis with Locally Deployable Large Language Models
We study multimodal survival analysis integrating clinical text, tabular covariates, and genomic profiles using locally deployable large language models (LLMs). As many institutions face tight comp...
Moritz Gögl, Christopher Yau
dynActivation: A Trainable Activation Family for Adaptive Nonlinearity
This paper proposes $\mathrm{dynActivation}$, a per-layer trainable activation defined as $f_i(x) = \mathrm{BaseAct}(x)(α_i - β_i) + β_i x$, where $α_i$ and $β_i$ are lightweight learned scalars th...
Alois Bachmann
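The formula quoted in the dynActivation abstract above can be sketched in a few lines. This is only an illustration of the stated expression $f_i(x) = \mathrm{BaseAct}(x)(α_i - β_i) + β_i x$, not the author's implementation: the choice of ReLU as `BaseAct`, the function names, and the example values of `alpha` and `beta` are all assumptions, since the truncated abstract does not specify them.

```python
from typing import Callable


def relu(x: float) -> float:
    """Assumed base activation; the abstract does not say which BaseAct is used."""
    return max(0.0, x)


def dyn_activation(
    x: float,
    alpha: float,
    beta: float,
    base_act: Callable[[float], float] = relu,
) -> float:
    """Evaluate f(x) = BaseAct(x) * (alpha - beta) + beta * x for one layer.

    alpha and beta stand in for the per-layer learned scalars; here they are
    plain floats chosen by hand (hypothetical values, not trained ones).
    """
    return base_act(x) * (alpha - beta) + beta * x


# With alpha = 1 and beta = 0 the expression reduces to the base activation.
as_base = dyn_activation(-3.0, alpha=1.0, beta=0.0)

# With alpha == beta the base-activation term vanishes, leaving beta * x.
as_linear = dyn_activation(-3.0, alpha=1.0, beta=1.0)

# Intermediate values blend the nonlinear and linear terms.
blended = dyn_activation(-2.0, alpha=1.0, beta=0.5)
```

Read this way, the two scalars interpolate between the base nonlinearity and a purely linear map, which follows directly from the formula; how the paper initializes or trains them is not stated in the excerpt.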
More Isn't Always Better: Balancing Decision Accuracy and Conformity Pressures in Multi-AI Advice
Just as people improve decision-making by consulting diverse human advisors, they can now also consult with multiple AI systems. Prior work on group decision-making shows that advice aggregation cr...
Yuta Tsuchiya, Yukino Baba