Papers
Research papers from arXiv and related sources
Embedded Quantum Machine Learning in Embedded Systems: Feasibility, Hybrid Architectures, and Quantum Co-Processors
Embedded quantum machine learning (EQML) seeks to bring quantum machine learning (QML) capabilities to resource-constrained edge platforms such as IoT nodes, wearables, drones, and cyber-physical c...
Somdip Dey, Syed Muhammad Raza
Do You See What I Am Pointing At? Gesture-Based Egocentric Video Question Answering
Understanding and answering questions based on a user's pointing gesture is essential for next-generation egocentric AI assistants. However, current Multimodal Large Language Models (MLLMs) struggl...
Yura Choi, Roy Miles, Rolandos Alexandros Potamias, Ismail Elezi, Jiankang Deng, Stefanos Zafeiriou
LLM BiasScope: A Real-Time Bias Analysis Platform for Comparative LLM Evaluation
As large language models (LLMs) are deployed widely, detecting and understanding bias in their outputs is critical. We present LLM BiasScope, a web application for side-by-side comparison of LLM ou...
Himel Ghosh, Nick Elias Werner
When LLM Judge Scores Look Good but Best-of-N Decisions Fail
Large language models are often used as judges to score candidate responses, then validated with a single global metric such as correlation with reference labels. This can be misleading when the re...
Eddie Landesberg
How Fair is Software Fairness Testing?
Software fairness testing is a central method for evaluating AI systems, yet the meaning of fairness is often treated as fixed and universally applicable. This vision paper positions fairness testi...
Ann Barcomb, Mariana Pinheiro Bento, Giuseppe Destefanis, Sherlock Licorish, Cleyton Magalhães, R...
ELLA: Generative AI-Powered Social Robots for Early Language Development at Home
Early language development shapes children's later literacy and learning, yet many families have limited access to scalable, high-quality support at home. Recent advances in generative AI make it p...
Victor Nikhil Antony, Shiye Cao, Shuning Wang, Chien-Ming Huang
TRACE: Temporal Rule-Anchored Chain-of-Evidence on Knowledge Graphs for Interpretable Stock Movement Prediction
We present a Temporal Rule-Anchored Chain-of-Evidence (TRACE) on knowledge graphs for interpretable stock movement prediction that unifies symbolic relational priors, dynamic graph exploration, and...
Qianggang Ding, Haochen Shi, Luis Castejón Lozano, Miguel Conner, Juan Abia, Luis Gallego-Ledesma...
Modal Logical Neural Networks for Financial AI
The financial industry faces a critical dichotomy in AI adoption: deep learning often delivers strong empirical performance, while symbolic logic offers interpretability and rule adherence expected...
Antonin Sulc
Attend Before Attention: Efficient and Scalable Video Understanding via Autoregressive Gazing
Multi-modal large language models (MLLMs) have advanced general-purpose video understanding but struggle with long, high-resolution videos -- they process every pixel equally in their vision transf...
Baifeng Shi, Stephanie Fu, Long Lian, Hanrong Ye, David Eigen, Aaron Reite, Boyi Li, Jan Kautz, S...
Examining Reasoning LLMs-as-Judges in Non-Verifiable LLM Post-Training
Reasoning LLMs-as-Judges, which can benefit from inference-time scaling, provide a promising path for extending the success of reasoning models to non-verifiable domains where the output correctnes...
Yixin Liu, Yue Yu, DiJia Su, Sid Wang, Xuewei Wang, Song Jiang, Bo Liu, Arman Cohan, Yuandong Tia...
Security Considerations for Artificial Intelligence Agents
This article, a lightly adapted version of Perplexity's response to NIST/CAISI Request for Information 2025-0035, details our observations and recommendations concerning the security of frontier AI...
Ninghui Li, Kaiyuan Zhang, Kyle Polley, Jerry Ma
Sparking Scientific Creativity via LLM-Driven Interdisciplinary Inspiration
Despite interdisciplinary research leading to larger and longer-term impact, most work remains confined to single-domain academic silos. Recent AI-based approaches to scientific discovery show prom...
Priyanka Kargupta, Shuhaib Mehri, Dilek Hakkani-Tur, Jiawei Han
CLASP: Defending Hybrid Large Language Models Against Hidden State Poisoning Attacks
State space models (SSMs) like Mamba have gained significant traction as efficient alternatives to Transformers, achieving linear complexity while maintaining competitive performance. However, Hidd...
Alexandre Le Mercier, Thomas Demeester, Chris Develder
Long-Context Encoder Models for Polish Language Understanding
While decoder-only Large Language Models (LLMs) have recently dominated the NLP landscape, encoder-only architectures remain a cost-effective and parameter-efficient standard for discriminative tas...
Sławomir Dadas, Rafał Poświata, Marek Kozłowski, Małgorzata Grębowiec, Michał Perełkiewicz, Paweł...
BehaviorVLM: Unified Finetuning-Free Behavioral Understanding with Vision-Language Reasoning
Understanding freely moving animal behavior is central to neuroscience, where pose estimation and behavioral understanding form the foundation for linking neural activity to natural actions. Yet bo...
Jingyang Ke, Weihan Li, Amartya Pradhan, Jeffrey Markowitz, Anqi Wu
QAQ: Bidirectional Semantic Coherence for Selecting High-Quality Synthetic Code Instructions
Synthetic data has become essential for training code generation models, yet it introduces significant noise and hallucinations that are difficult to detect with current metrics. Existing data sele...
Jiayin Lei, Ming Ma, Yunxi Duan, Chenxi Li, Tianming Yang
Investigating Student Perceptions of Creativity and Generative AI in Computational Physics
Generative Artificial Intelligence (GenAI) is rapidly becoming more integrated into today's classrooms at all levels of education. In higher education, GenAI is often seen as a resource for stude...
Pachi Her, Patti Hamerski
LifeSim: Long-Horizon User Life Simulator for Personalized Assistant Evaluation
The rapid advancement of large language models (LLMs) has accelerated progress toward universal AI assistants. However, existing benchmarks for personalized assistants remain misaligned with real-w...
Feiyu Duan, Xuanjing Huang, Zhongyu Wei
IsoCompute Playbook: Optimally Scaling Sampling Compute for LLM RL
While scaling laws guide compute allocation for LLM pre-training, analogous prescriptions for reinforcement learning (RL) post-training of large language models (LLMs) remain poorly understood. We ...
Zhoujun Cheng, Yutao Xie, Yuxiao Qu, Amrith Setlur, Shibo Hao, Varad Pimpalkhute, Tongtong Liang,...
Increasing intelligence in AI agents can worsen collective outcomes
When resources are scarce, will a population of AI agents coordinate in harmony, or descend into tribal chaos? Diverse decision-making AI from different developers is entering everyday devices -- f...
Neil F. Johnson