Papers
Research papers from arXiv and related sources
Human or Machine? A Preliminary Turing Test for Speech-to-Speech Interaction
The pursuit of human-like conversational agents has long been guided by the Turing test. For modern speech-to-speech (S2S) systems, a critical yet unanswered question is whether they can converse l...
Xiang Li, Jiabao Gao, Sipei Lin, Xuan Zhou, Chi Zhang, Bo Cheng, Jiale Han, Benyou Wang
Ecological memory of hydrodynamic cues shapes growth and migration of motile microorganisms
Microorganisms live in inherently dynamic environments where fluctuations in biotic and abiotic factors shape their behaviour, physiology, and fitness. The concept of ecological memory: the lasting...
Narges Kakavand, Anupam Sengupta
A Novel Hierarchical Multi-Agent System for Payments Using LLMs
Large language model (LLM) agents, such as OpenAI's Operator and Claude's Computer Use, can automate workflows but are unable to handle payment tasks. Existing agentic solutions have gained significant...
Joon Kiat Chua, Donghao Huang, Zhaoxia Wang
Task Complexity Matters: An Empirical Study of Reasoning in LLMs for Sentiment Analysis
Large language models (LLMs) with reasoning capabilities have fueled a compelling narrative that reasoning universally improves performance across language tasks. We test this claim through a compr...
Donghao Huang, Zhaoxia Wang
CIRCLE: A Framework for Evaluating AI from a Real-World Lens
This paper proposes CIRCLE, a six-stage, lifecycle-based framework to bridge the reality gap between model-centric performance metrics and AI's materialized outcomes in deployment. While existing f...
Reva Schwartz, Carina Westling, Morgan Briggs, Marzieh Fadaee, Isar Nejadgholi, Matthew Holmes, F...
Unsupervised Baseline Clustering and Incremental Adaptation for IoT Device Traffic Profiling
The growth and heterogeneity of IoT devices create security challenges where static identification models can degrade as traffic evolves. This paper presents a two-stage, flow-feature-based pipelin...
Sean M. Alderman, John D. Hastings
Data Driven Optimization of GPU efficiency for Distributed LLM Adapter Serving
Large Language Model (LLM) adapters enable low-cost model specialization, but introduce complex caching and scheduling challenges in distributed serving systems where hundreds of adapters must be h...
Ferran Agullo, Joan Oliveras, Chen Wang, Alberto Gutierrez-Torre, Olivier Tardieu, Alaa Youssef, ...
Spatio-Temporal Garment Reconstruction Using Diffusion Mapping via Pattern Coordinates
Reconstructing 3D clothed humans from monocular images and videos is a fundamental problem with applications in virtual try-on, avatar creation, and mixed reality. Despite significant progress in h...
Yingxuan You, Ren Li, Corentin Dumery, Cong Cao, Hao Li, Pascal Fua
RewardUQ: A Unified Framework for Uncertainty-Aware Reward Models
Reward models are central to aligning large language models (LLMs) with human preferences. Yet most approaches rely on pointwise reward estimates that overlook the epistemic uncertainty in reward m...
Daniel Yang, Samuel Stante, Florian Redhardt, Lena Libon, Parnian Kassraie, Ido Hakimi, Barna Pás...
Designing AI Tutors for Interest-Based Learning: Insights from Human Instructors
Interest-based learning (IBL) is a paradigm of instruction in which educational content is contextualized using learners' interests to enhance content relevance. IBL has been shown to result in imp...
Abhishek Kulkarni, Sharon Lynn Chu
A Quality Framework for Testing Gravity with Wide Binaries: No Evidence for MOND
Wide binaries (WBs) offer a unique opportunity to test gravity in the low-acceleration regime, where modifications such as Milgromian dynamics (MOND) predict measurable deviations from Newtonian gr...
Stephen A. Cookson, Indranil Banik, Kareem El-Badry, Will Sutherland, Zephyr Penoyre, Charalambos...
GuardAlign: Test-time Safety Alignment in Multimodal Large Language Models
Large vision-language models (LVLMs) have achieved remarkable progress in vision-language reasoning tasks, yet ensuring their safety remains a critical challenge. Recent input-side defenses detect ...
Xingyu Zhu, Beier Zhu, Junfeng Fang, Shuo Wang, Yin Zhang, Xiang Wang, Xiangnan He
Breaking the Illusion of Artificial Consensus: Clone-Robust Weighting for Arbitrary Metric Spaces
Independent media are central to democratic decision-making, yet recent technological developments, such as social media, pseudonymous identities, and generative AI, have made them more vulnerable ...
Damien Berriaud, Roger Wattenhofer
Cross-order induced behaviors in contagion dynamics on higher-order networks
Recent studies have shown that novel collective behaviors emerge in complex systems due to higher-order interactions. However, the way in which the structural correlations of these interactions sha...
Kaloyan Danovski, Sandro Meloni, Michele Starnini
Steering and Rectifying Latent Representation Manifolds in Frozen Multi-modal LLMs for Video Anomaly Detection
Video anomaly detection (VAD) aims to identify abnormal events in videos. Traditional VAD methods generally suffer from the high costs of labeled data and full training, thus some recent works have...
Zhaolin Cai, Fan Li, Huiyu Duan, Lijun He, Guangtao Zhai
Interpretable Debiasing of Vision-Language Models for Social Fairness
The rapid advancement of vision-language models (VLMs) has raised growing concerns that their black-box reasoning processes could lead to unintended forms of social bias. Current debiasing approach...
Na Min An, Yoonna Jang, Yusuke Hirota, Ryo Hachiuma, Isabelle Augenstein, Hyunjung Shim
LeGend: A Data-Driven Framework for Lemma Generation in Hardware Model Checking
Property checking of RTL designs is a central task in formal verification. Among available engines, IC3/PDR is a widely used backbone whose performance critically depends on inductive generalizatio...
Mingkai Miao, Guangyu Hu, Wei Zhang, Hongce Zhang
Jailbreak Foundry: From Papers to Runnable Attacks for Reproducible Benchmarking
Jailbreak techniques for large language models (LLMs) evolve faster than benchmarks, making robustness estimates stale and difficult to compare across papers due to drift in datasets, harnesses, an...
Zhicheng Fang, Jingjie Zheng, Chenxu Fu, Wei Xu
Foundation World Models for Agents that Learn, Verify, and Adapt Reliably Beyond Static Environments
The next generation of autonomous agents must not only learn efficiently but also act reliably and adapt their behavior in open worlds. Standard approaches typically assume fixed tasks and environm...
Florent Delgrange
Large-scale portfolio optimization on a trapped-ion quantum computer
We present an end-to-end pipeline for large-scale portfolio selection with cardinality constraints and experimentally demonstrate it on trapped-ion quantum processors using hardware-aware decomposi...
Alejandro Gomez Cadavid, Ananth Kaushik, Pranav Chandarana, Miguel Angel Lopez-Ruiz, Gaurav Dev, ...