Papers
Research papers from arXiv and related sources
Do You See What I Am Pointing At? Gesture-Based Egocentric Video Question Answering
Understanding and answering questions based on a user's pointing gesture is essential for next-generation egocentric AI assistants. However, current Multimodal Large Language Models (MLLMs) struggl...
Yura Choi, Roy Miles, Rolandos Alexandros Potamias, Ismail Elezi, Jiankang Deng, Stefanos Zafeiriou
LLM BiasScope: A Real-Time Bias Analysis Platform for Comparative LLM Evaluation
As large language models (LLMs) are deployed widely, detecting and understanding bias in their outputs is critical. We present LLM BiasScope, a web application for side-by-side comparison of LLM ou...
Himel Ghosh, Nick Elias Werner
When LLM Judge Scores Look Good but Best-of-N Decisions Fail
Large language models are often used as judges to score candidate responses, then validated with a single global metric such as correlation with reference labels. This can be misleading when the re...
Eddie Landesberg
Trajectory probing of complex-frequency scattering with chirped analytic pulses
Characterizing resonant scatterers is challenging because their poles and zeros usually lie away from the real-frequency axis, whereas most measurements sample only real frequencies and infer off-a...
Alex Krasnok, Denis Seletskiy
Gaussian and bootstrap approximations for functional principal component regression
Asymptotic inference using functional principal component regression (FPCR) has long been considered difficult, largely because, upon any scalar scaling, the FPCR estimator fails to satisfy a centr...
Hyemin Yeon
How Fair is Software Fairness Testing?
Software fairness testing is a central method for evaluating AI systems, yet the meaning of fairness is often treated as fixed and universally applicable. This vision paper positions fairness testi...
Ann Barcomb, Mariana Pinheiro Bento, Giuseppe Destefanis, Sherlock Licorish, Cleyton Magalhães, R...
ELLA: Generative AI-Powered Social Robots for Early Language Development at Home
Early language development shapes children's later literacy and learning, yet many families have limited access to scalable, high-quality support at home. Recent advances in generative AI make it p...
Victor Nikhil Antony, Shiye Cao, Shuning Wang, Chien-Ming Huang
TRACE: Temporal Rule-Anchored Chain-of-Evidence on Knowledge Graphs for Interpretable Stock Movement Prediction
We present a Temporal Rule-Anchored Chain-of-Evidence (TRACE) on knowledge graphs for interpretable stock movement prediction that unifies symbolic relational priors, dynamic graph exploration, and...
Qianggang Ding, Haochen Shi, Luis Castejón Lozano, Miguel Conner, Juan Abia, Luis Gallego-Ledesma...
Modal Logical Neural Networks for Financial AI
The financial industry faces a critical dichotomy in AI adoption: deep learning often delivers strong empirical performance, while symbolic logic offers interpretability and rule adherence expected...
Antonin Sulc
Video Streaming Thinking: VideoLLMs Can Watch and Think Simultaneously
Online Video Large Language Models (VideoLLMs) play a critical role in supporting responsive, real-time interaction. Existing methods focus on streaming perception, lacking a synchronized logical r...
Yiran Guan, Liang Yin, Dingkang Liang, Jianzhong Ju, Zhenbo Luo, Jian Luan, Yuliang Liu, Xiang Bai
Spatial-TTT: Streaming Visual-based Spatial Intelligence with Test-Time Training
Humans perceive and understand real-world spaces through a stream of visual observations. Therefore, the ability to streamingly maintain and update spatial evidence from potentially unbounded video...
Fangfu Liu, Diankun Wu, Jiawei Chi, Yimo Cai, Yi-Hsin Hung, Xumin Yu, Hao Li, Han Hu, Yongming Ra...
Attend Before Attention: Efficient and Scalable Video Understanding via Autoregressive Gazing
Multi-modal large language models (MLLMs) have advanced general-purpose video understanding but struggle with long, high-resolution videos -- they process every pixel equally in their vision transf...
Baifeng Shi, Stephanie Fu, Long Lian, Hanrong Ye, David Eigen, Aaron Reite, Boyi Li, Jan Kautz, S...
Examining Reasoning LLMs-as-Judges in Non-Verifiable LLM Post-Training
Reasoning LLMs-as-Judges, which can benefit from inference-time scaling, provide a promising path for extending the success of reasoning models to non-verifiable domains where the output correctnes...
Yixin Liu, Yue Yu, DiJia Su, Sid Wang, Xuewei Wang, Song Jiang, Bo Liu, Arman Cohan, Yuandong Tia...
Thermalisation as Diffusion in Hilbert Space
We develop a microscopic theory of thermalisation for a thermometer coupled to a many-body bath beyond standard Markovian and Fermi-golden-rule assumptions. By modeling interaction matrix elements ...
Aleksey Lunkin
Incremental Neural Network Verification via Learned Conflicts
Neural network verification is often used as a core component within larger analysis procedures, which generate sequences of closely related verification queries over the same network. In existing ...
Raya Elsaleh, Liam Davis, Haoze Wu, Guy Katz
Security Considerations for Artificial Intelligence Agents
This article, a lightly adapted version of Perplexity's response to NIST/CAISI Request for Information 2025-0035, details our observations and recommendations concerning the security of frontier AI...
Ninghui Li, Kaiyuan Zhang, Kyle Polley, Jerry Ma
Language Model Teams as Distributed Systems
Large language models (LLMs) are growing increasingly capable, prompting recent interest in LLM teams. Yet, despite increased deployment of LLM teams at scale, we lack a principled framework for ad...
Elizabeth Mieczkowski, Katherine M. Collins, Ilia Sucholutsky, Natalia Vélez, Thomas L. Griffiths
Sparking Scientific Creativity via LLM-Driven Interdisciplinary Inspiration
Despite interdisciplinary research leading to larger and longer-term impact, most work remains confined to single-domain academic silos. Recent AI-based approaches to scientific discovery show prom...
Priyanka Kargupta, Shuhaib Mehri, Dilek Hakkani-Tur, Jiawei Han
Conformalized Data-Driven Reachability Analysis with PAC Guarantees
Data-driven reachability analysis computes over-approximations of reachable sets directly from noisy data. Existing deterministic methods require either known noise bounds or system-specific struct...
Yanliang Huang, Zhen Zhang, Peng Xie, Zhuoqi Zeng, Amr Alanwar
A blended approach for evolving phase fields using peridynamics: Cyclic loading in quasi-brittle fracture
A field theory is presented for predicting damage and fracture in quasi-brittle materials incorporating effects of irreversible (plastic) deformation as well as elastic moduli that soften with dama...
Hayden Bromley, Robert Lipton