Research

Papers

Research papers from arXiv and related sources

Total: 4513, AI/LLM: 2483, Testing: 2030
TESTING

Video Streaming Thinking: VideoLLMs Can Watch and Think Simultaneously

Online Video Large Language Models (VideoLLMs) play a critical role in supporting responsive, real-time interaction. Existing methods focus on streaming perception, lacking a synchronized logical r...

Yiran Guan, Liang Yin, Dingkang Liang, Jianzhong Ju, Zhenbo Luo, Jian Luan, Yuliang Liu, Xiang Bai

2603.12262 2026-03-12
TESTING

Spatial-TTT: Streaming Visual-based Spatial Intelligence with Test-Time Training

Humans perceive and understand real-world spaces through a stream of visual observations. Therefore, the ability to streamingly maintain and update spatial evidence from potentially unbounded video...

Fangfu Liu, Diankun Wu, Jiawei Chi, Yimo Cai, Yi-Hsin Hung, Xumin Yu, Hao Li, Han Hu, Yongming Ra...

2603.12255 2026-03-12
TESTING

Thermalisation as Diffusion in Hilbert Space

We develop a microscopic theory of thermalisation for a thermometer coupled to a many-body bath beyond standard Markovian and Fermi-golden-rule assumptions. By modeling interaction matrix elements ...

Aleksey Lunkin

2603.12234 2026-03-12
TESTING

Incremental Neural Network Verification via Learned Conflicts

Neural network verification is often used as a core component within larger analysis procedures, which generate sequences of closely related verification queries over the same network. In existing ...

Raya Elsaleh, Liam Davis, Haoze Wu, Guy Katz

2603.12232 2026-03-12
TESTING

Language Model Teams as Distributed Systems

Large language models (LLMs) are growing increasingly capable, prompting recent interest in LLM teams. Yet, despite increased deployment of LLM teams at scale, we lack a principled framework for ad...

Elizabeth Mieczkowski, Katherine M. Collins, Ilia Sucholutsky, Natalia Vélez, Thomas L. Griffiths

2603.12229 2026-03-12
TESTING

Conformalized Data-Driven Reachability Analysis with PAC Guarantees

Data-driven reachability analysis computes over-approximations of reachable sets directly from noisy data. Existing deterministic methods require either known noise bounds or system-specific struct...

Yanliang Huang, Zhen Zhang, Peng Xie, Zhuoqi Zeng, Amr Alanwar

2603.12220 2026-03-12
TESTING

A blended approach for evolving phase fields using peridynamics: Cyclic loading in quasi-brittle fracture

A field theory is presented for predicting damage and fracture in quasi-brittle materials incorporating effects of irreversible (plastic) deformation as well as elastic moduli that soften with dama...

Hayden Bromley, Robert Lipton

2603.12210 2026-03-12
TESTING

Shifted-geodesic approximation for spinning-body gravitational wave fluxes

We present a shifted-geodesic framework for computing gravitational-wave fluxes from spinning test bodies moving on bound orbits of Kerr black holes. The method provides a simple and efficient mean...

Lisa V. Drummond, Scott A. Hughes, Viktor Skoupý, Philip Lynch, Gabriel Andres Piovano

2603.12189 2026-03-12
TESTING

Strategic Navigation or Stochastic Search? How Agents and Humans Reason Over Document Collections

Multimodal agents offer a promising path to automating complex document-intensive workflows. Yet, a critical question remains: do these agents demonstrate genuine strategic reasoning, or merely sto...

Łukasz Borchmann, Jordy Van Landeghem, Michał Turski, Shreyansh Padarha, Ryan Othniel Kearns, Ada...

2603.12180 2026-03-12
TESTING

Linking Perception, Confidence and Accuracy in MLLMs

Recent advances in Multi-modal Large Language Models (MLLMs) have predominantly focused on enhancing visual perception to improve accuracy. However, a critical question remains unexplored: Do model...

Yuetian Du, Yucheng Wang, Rongyu Zhang, Zhijie Xu, Boyu Yang, Ming Kong, Jie Liu, Qiang Zhu

2603.12149 2026-03-12
TESTING

Automatic Generation of High-Performance RL Environments

Translating complex reinforcement learning (RL) environments into high-performance implementations has traditionally required months of specialized engineering. We present a reusable recipe - a gen...

Seth Karten, Rahul Dev Appapogu, Chi Jin

2603.12145 2026-03-12
TESTING

TopoBench: Benchmarking LLMs on Hard Topological Reasoning

Solving topological grid puzzles requires reasoning over global spatial invariants such as connectivity, loop closure, and region symmetry and remains challenging for even the most powerful large l...

Mayug Maniparambil, Nils Hoehing, Janak Kapuriya, Arjun Karuvally, Ellen Rushe, Anthony Ventresqu...

2603.12133 2026-03-12
TESTING

Cross-Context Review: Improving LLM Output Quality by Separating Production and Review Sessions

Large language models struggle to catch errors in their own outputs when the review happens in the same session that produced them. This paper introduces Cross-Context Review (CCR), a straightforwa...

Tae-Eun Song

2603.12123 2026-03-12
TESTING

CRAFT: A Tendon-Driven Hand with Hybrid Hard-Soft Compliance

We introduce CRAFT hand, a tendon-driven anthropomorphic hand with hybrid hard-soft compliance for contact-rich manipulation. The design is based on a simple idea: contact is not uniform across the...

Leo Lin, Shivansh Patel, Jay Moon, Svetlana Lazebnik, Unnat Jain

2603.12120 2026-03-12
TESTING

SommBench: Assessing Sommelier Expertise of Language Models

With the rapid advances of large language models, it becomes increasingly important to systematically evaluate their multilingual and multicultural capabilities. Previous cultural evaluation benchm...

William Brach, Tomas Bedej, Jacob Nielsen, Jacob Pichna, Juraj Bedej, Eemeli Saarensilta, Julie D...

2603.12117 2026-03-12
TESTING

EmbTracker: Traceable Black-box Watermarking for Federated Language Models

Federated Language Model (FedLM) allows collaborative learning without sharing raw data, yet it introduces a critical vulnerability, as every untrustworthy client may leak the received functional...

Haodong Zhao, Jinming Hu, Yijie Bai, Tian Dong, Wei Du, Zhuosheng Zhang, Yanjiao Chen, Haojin Zhu...

2603.12089 2026-03-12
TESTING

Direct Boltzmann inversion method from particle configurations at arbitrary state points

We introduce a direct Boltzmann inversion method to infer the interaction potential in particle systems using as input particle configurations generated at an arbitrary state point of the system. U...

Olivier Coquand, Davide Paolino, Ludovic Berthier

2603.12081 2026-03-12
TESTING

LoV3D: Grounding Cognitive Prognosis Reasoning in Longitudinal 3D Brain MRI via Regional Volume Assessments

Longitudinal brain MRI is essential for characterizing the progression of neurological diseases such as Alzheimer's disease assessment. However, current deep-learning tools fragment this process: c...

Zhaoyang Jiang, Zhizhong Fu, David McAllister, Yunsoo Kim, Honghan Wu

2603.12071 2026-03-12
TESTING

Numerical benchmark for damage identification in Structural Health Monitoring

The availability of a dataset for validation and verification purposes of novel data-driven strategies and/or hybrid physics-data approaches is currently one of the most pressing challenges in the ...

Francesca Marafini, Giacomo Zini, Alberto Barontini, Nuno Mendes, Alice Cicirello, Michele Betti,...

2603.12069 2026-03-12
TESTING

Translationese as a Rational Response to Translation Task Difficulty

Translations systematically diverge from texts originally produced in the target language, a phenomenon widely referred to as translationese. Translationese has been attributed to production tenden...

Maria Kunilovskaya

2603.12050 2026-03-12