Papers
Research papers from arXiv and related sources
Uncertainty Mitigation and Intent Inference: A Dual-Mode Human-Machine Joint Planning System
Effective human-robot collaboration in open-world environments requires joint planning under uncertain conditions. However, existing approaches often treat humans as passive supervisors, preventing...
Zeyu Fang, Yuxin Lin, Cheng Liu, Beomyeol Yu, Zeyuan Yang, Rongqian Chen, Taeyoung Lee, Mahdi Ima...
Broken Access: On the Challenges of Screen Reader Assisted Two-Factor and Passwordless Authentication
In today's technology-driven world, web services have opened up new opportunities for blind and visually impaired people to interact independently. Securing interactions with these services is cruc...
Md Mojibur Rahman Redoy Akanda, Ahmed Tanvir Mahdad, Nitesh Saxena
Learning embeddings of non-linear PDEs: the Burgers' equation
Embeddings provide low-dimensional representations that organize complex function spaces and support generalization. They provide a geometric representation that supports efficient retrieval, compa...
Pedro Tarancón-Álvarez, Leonid Sarieddine, Pavlos Protopapas, Raul Jimenez
Machine Learning for Electrode Materials: Property Prediction via Composition
In this work, we benchmark three leading Machine Learning (ML) frameworks (MODNet, CrabNet, and a random forest model based on Magpie features) for predicting properties of battery electrode materials...
Hao Wu, Cameron Hargreaves, Arpit Mishra, Gian-Marco Rignanese
Inverse Resistive Force Theory (I-RFT): Learning granular properties through robot-terrain physical interactions
For robots to navigate safely and efficiently on soft, granular terrains, it is crucial to gather information about the terrain's mechanical properties, which directly affect locomotion performance...
Shipeng Liu, Feng Xue, Yifeng Zhang, Tarunika Ponnusamy, Feifei Qian
A trigonometric approach to an identity by Ramanujan
An identity by Ramanujan is expressed using polar coordinates, so that its proof reduces to the verification of an elementary trigonometric identity. This approach produces a few variations on Rama...
C. Vignat
OrdinalBench: A Benchmark Dataset for Diagnosing Generalization Limits in Ordinal Number Understanding of Vision-Language Models
Vision-Language Models (VLMs) have advanced across multimodal benchmarks but still show clear gaps in ordinal number understanding, i.e., the ability to track relative positions and generalize to l...
Yusuke Tozaki, Hisashi Miyamori
Evolution of density perturbations in fractional Newtonian cosmology
In this work, density perturbations are investigated within the framework of a fractional Newtonian cosmology. Focusing on the matter-dominated era and employing the fluid-flow approach, the growth...
S. M. M. Rasouli
BEVLM: Distilling Semantic Knowledge from LLMs into Bird's-Eye View Representations
The integration of Large Language Models (LLMs) into autonomous driving has attracted growing interest for their strong reasoning and semantic understanding abilities, which are essential for handl...
Thomas Monninger, Shaoyuan Xie, Qi Alfred Chen, Sihao Ding
SUREON: A Benchmark and Vision-Language-Model for Surgical Reasoning
Surgeons don't just see -- they interpret. When an expert observes a surgical scene, they understand not only what instrument is being used, but why it was chosen, what risk it poses, and what come...
Alejandra Perez, Anita Rau, Lee White, Busisiwe Mlambo, Chinedu Nwoye, Muhammad Abdullah Jamal, O...
Penguin-VL: Exploring the Efficiency Limits of VLM with LLM-based Vision Encoders
Vision Language Model (VLM) development has largely relied on scaling model size, which hinders deployment on compute-constrained mobile and edge devices such as smartphones and robots. In this wor...
Boqiang Zhang, Lei Ke, Ruihan Yang, Qi Gao, Tianyuan Qu, Rossell Chen, Dong Yu, Leoweiliang
The Pen: Episodic Cognitive Assistance via an Ear-Worn Interface
Wearable AI is often designed as always-available, yet continuous availability can conflict with how people work and socialize, creating discomfort around privacy, disruption, and unclear system bo...
Yonatan Tussa, Andy Heredia
KCLarity at SemEval-2026 Task 6: Encoder and Zero-Shot Approaches to Political Evasion Detection
This paper describes the KCLarity team's participation in CLARITY, a shared task at SemEval 2026 on classifying ambiguity and evasion techniques in political discourse. We investigate two modelling...
Archie Sage, Salvatore Greco
Understanding and Finding JIT Compiler Performance Bugs
Just-in-time (JIT) compilers are key components for many popular programming languages with managed runtimes (e.g., Java and JavaScript). JIT compilers perform optimizations and generate native cod...
Zijian Yi, Cheng Ding, August Shi, Milos Gligoric
RAMoEA-QA: Hierarchical Specialization for Robust Respiratory Audio Question Answering
Conversational generative AI is rapidly entering healthcare, where general-purpose models must integrate heterogeneous patient signals and support diverse interaction styles while producing clinica...
Gaia A. Bertolino, Yuwei Zhang, Tong Xia, Domenico Talia, Cecilia Mascolo
Evaluating the Predictability of Selected Weather Extremes with Aurora, an AI Weather Forecast Model
AI weather foundation models now achieve forecast skill comparable to numerical weather prediction at far lower computational cost, yet their predictability for high-impact extremes across dynamica...
Qin Huang, Moyan Liu, Yeongbin Kwon, Upmanu Lall
Galaxy UV Legacy Project: Survey Description and First Insights Into NGC 4449 Recent History of Star Formation
The Galaxy UV Legacy Project (GULP) is a Cycle 28 Treasury program with the Hubble Space Telescope (HST) designed to characterize resolved massive stars, OB associations, and young star clusters (Y...
E. Sabbi, B. Meena, P. Zeidler, V. Bajaj, D. Calzetti, J. J. Eldridge, P. Facchini, S. Linden, P....
When One Modality Rules Them All: Backdoor Modality Collapse in Multimodal Diffusion Models
While diffusion models have revolutionized visual content generation, their rapid adoption has underscored the critical need to investigate vulnerabilities, e.g., to backdoor attacks. In multimodal...
Qitong Wang, Haoran Dai, Haotian Zhang, Christopher Rasmussen, Binghui Wang
Beyond Rows to Reasoning: Agentic Retrieval for Multimodal Spreadsheet Understanding and Editing
Recent advances in multimodal Retrieval-Augmented Generation (RAG) enable Large Language Models (LLMs) to analyze enterprise spreadsheet workbooks containing millions of cells, cross-sheet dependen...
Anmol Gulati, Sahil Sen, Waqar Sarguroh, Kevin Paul
CFEAR-Teach-and-Repeat: Fast and Accurate Radar-only Localization
Reliable localization in prior maps is essential for autonomous navigation, particularly under adverse weather, where optical sensors may fail. We present CFEAR-TR, a teach-and-repeat localization ...
Maximilian Hilger, Daniel Adolfsson, Ralf Becker, Henrik Andreasson, Achim J. Lilienthal