Research

Papers

Research papers from arXiv and related sources

Total: 4694, AI/LLM: 2583, Testing: 2111
TESTING

A Kernel Two-Sample Test Invariant under Group Action with Applications to Functional Data

We introduce a kernel-based two-sample test for comparing probability distributions up to group actions. Our construction yields invariant kernels for locally compact $σ$-compact groups and extends...

Madison Giacofci, Anouar Meynaoui, Alex Podgorny

2603.16294 2026-03-17
TESTING

VisBrowse-Bench: Benchmarking Visual-Native Search for Multimodal Browsing Agents

The rapid advancement of Multimodal Large Language Models (MLLMs) has enabled browsing agents to acquire and reason over multimodal information in the real world. But existing benchmarks suffer fro...

Zhengbo Zhang, Jinbo Su, Zhaowen Zhou, Changtao Miao, Yuhan Hong, Qimeng Wu, Yumeng Liu, Feier Wu...

2603.16289 2026-03-17
AI LLM

CAST-TTS: A Simple Cross-Attention Framework for Unified Timbre Control in TTS

Current Text-to-Speech (TTS) systems typically use separate models for speech-prompted and text-prompted timbre control. While unifying both control signals into a single model is desirable, the ch...

Zihao Zheng, Wen Wu, Chao Zhang, Mengyue Wu, Xuenan Xu

2603.16280 2026-03-17
TESTING

VIGOR: VIdeo Geometry-Oriented Reward for Temporal Generative Alignment

Video diffusion models lack explicit geometric supervision during training, leading to inconsistency artifacts such as object deformation, spatial drift, and depth violations in generated videos. T...

Tengjiao Yin, Jinglei Shi, Heng Guo, Xi Wang

2603.16271 2026-03-17
AI LLM

Adaptive Theory of Mind for LLM-based Multi-Agent Coordination

Theory of Mind (ToM) refers to the ability to reason about others' mental states, and higher-order ToM involves considering that others also possess their own ToM. Equipping large language model (L...

Chunjiang Mu, Ya Zeng, Qiaosheng Zhang, Kun Shao, Chen Chu, Hao Guo, Danyang Jia, Zhen Wang, Shuy...

2603.16264 2026-03-17
AI LLM

Human/AI Collective Intelligence for Deliberative Democracy: A Human-Centred Design Approach

This chapter introduces the concept of Collective Intelligence for Deliberative Democracy (CI4DD). We propose that the use of computational tools, specifically artificial intelligence, to advance de...

Anna De Liddo, Lucas Anastasiou, Simon Buckingham Shum

2603.16260 2026-03-17
AI LLM

When Thinking Hurts: Mitigating Visual Forgetting in Video Reasoning via Frame Repetition

Recently, Multimodal Large Language Models (MLLMs) have demonstrated significant potential in complex visual tasks through the integration of Chain-of-Thought (CoT) reasoning. However, in Video Que...

Xiaokun Sun, Yubo Wang, Haoyu Cao, Linli Xu

2603.16256 2026-03-17
AI LLM

Grounding the Score: Explicit Visual Premise Verification for Reliable Vision-Language Process Reward Models

Vision-language process reward models (VL-PRMs) are increasingly used to score intermediate reasoning steps and rerank candidates under test-time scaling. However, they often function as black-box ...

Junxin Wang, Dai Guan, Weijie Qiu, Zhihang Li, Yongbo Gai, Zhengyi Yang, Mengyu Zhou, Erchao Zhao...

2603.16253 2026-03-17
AI LLM

Visual Prompt Discovery via Semantic Exploration

LVLMs encounter significant challenges in image understanding and visual reasoning, leading to critical perception failures. Visual prompts, which incorporate image manipulation code, have shown pr...

Jaechang Kim, Yotaro Shimose, Zhao Wang, Kuang-Da Wang, Jungseul Ok, Shingo Takamatsu

2603.16250 2026-03-17
AI LLM

How to Utilize Complementary Vision-Text Information for 2D Structure Understanding

LLMs typically linearize 2D tables into 1D sequences to fit their autoregressive architecture, which weakens row-column adjacency and other layout cues. In contrast, purely visual encoders can capt...

Jiancheng Dong, Pengyue Jia, Derong Xu, Jiawei Cheng, Jingyu Peng, Chao Zhang, Bowen Liu, Xin Sun...

2603.16245 2026-03-17
AI LLM

More Rounds, More Noise: Why Multi-Turn Review Fails to Improve Cross-Context Verification

Cross-Context Review (CCR) improves LLM verification by separating production and review into independent sessions. A natural extension is multi-turn review: letting the reviewer ask follow-up ques...

Song Tae-Eun

2603.16244 2026-03-17
TESTING

Industrial cuVSLAM Benchmark & Integration

This work presents a comprehensive benchmark evaluation of visual odometry (VO) and visual SLAM (VSLAM) systems for mobile robot navigation in real-world logistical environments. We compare multipl...

Charbel Abi Hana, Kameel Amareen, Mohamad Mostafa, Dmitry Slepichev, Hesam Rabeti, Zheng Wang, Mi...

2603.16240 2026-03-17
TESTING

Neural Pushforward Samplers for the Fokker-Planck Equation on Embedded Riemannian Manifolds

We extend the Weak Adversarial Neural Pushforward Method (WANPM) to the Fokker-Planck equation posed on a compact, smoothly embedded Riemannian manifold $M$ in $\mathbb{R}^n$. The key observation is that the...

Andrew Qing He, Wei Cai

2603.16239 2026-03-17
TESTING

SpecSteer: Synergizing Local Context and Global Reasoning for Efficient Personalized Generation

Realizing personalized intelligence faces a core dilemma: sending user history to centralized large language models raises privacy concerns, while on-device small language models lack the reasoning...

Hang Lv, Sheng Liang, Hao Wang, Yongyue Zhang, Hongchao Gu, Wei Guo, Defu Lian, Yong Liu, Enhong ...

2603.16219 2026-03-17
TESTING

Equivalence testing with data-dependent and post-hoc equivalence margins

Equivalence testing compares the hypothesis that an effect $μ$ is large against the alternative that it is negligible. Here, 'large' is classically expressed as being larger than some 'equivalence ...

Stan Koobs, Nick W. Koning

2603.16213 2026-03-17
TESTING

Rapid Worst-Case Gust Identification for Very Flexible Aircraft Using Reduced-Order Models

Identification of worst-case gust loads is a critical step in the certification of very flexible aircraft, yet the computational cost of nonlinear full-order simulations renders exhaustive parametr...

Nikolaos D. Tantaroudas, Andrea Da Ronch, Ilias Karachalios, Kenneth J. Badcock

2603.16212 2026-03-17
TESTING

Leveling3D: Leveling Up 3D Reconstruction with Feed-Forward 3D Gaussian Splatting and Geometry-Aware Generation

Feed-forward 3D reconstruction has revolutionized 3D vision, providing a powerful baseline for downstream tasks such as novel-view synthesis with 3D Gaussian Splatting. Previous works explore fixin...

Yiming Huang, Baixiang Huang, Beilei Cui, Chi Kit Ng, Long Bai, Hongliang Ren

2603.16211 2026-03-17
TESTING

Weak Adversarial Neural Pushforward Method for the McKean-Vlasov / Mean-Field Fokker-Planck Equation

We extend the Weak Adversarial Neural Pushforward Method (WANPM) to the McKean-Vlasov mean-field Fokker-Planck equation. For the quadratic interaction kernel, the mean-field nonlinearity reduces to...

Andrew Qing He, Wei Cai

2603.16186 2026-03-17
TESTING

Homogeneous and Heterogeneous Consistency progressive Re-ranking for Visible-Infrared Person Re-identification

Visible-infrared person re-identification faces greater challenges than traditional person re-identification due to the significant differences between modalities. In particular, the differences be...

Yiming Wang

2603.16165 2026-03-17
TESTING

Execution-Grounded Credit Assignment for GRPO in Code Generation

Critic-free reinforcement learning with verifiable rewards (RLVR) improves code generation by optimizing unit-test pass rates, but GRPO-style updates suffer from coarse credit assignment: a single ...

Abhijit Kumar, Natalya Kumar, Shikhar Gupta

2603.16158 2026-03-17