Papers
Research papers from arXiv and related sources
SOMP: Scalable Gradient Inversion for Large Language Models via Subspace-Guided Orthogonal Matching Pursuit
Gradient inversion attacks reveal that private training text can be reconstructed from shared gradients, posing a privacy risk to large language models (LLMs). While prior methods perform well in s...
Yibo Li, Qiongxiu Li
Finding Common Ground in a Sea of Alternatives
We study the problem of selecting a statement that finds common ground across diverse population preferences. Generative AI is uniquely suited for this task because it can access a practically infi...
Jay Chooi, Paul Gölz, Ariel D. Procaccia, Benjamin Schiffer, Shirley Zhang
Probing Cultural Signals in Large Language Models through Author Profiling
Large language models (LLMs) are increasingly deployed in applications with societal impact, raising concerns about the cultural biases they encode. We probe these representations by evaluating whe...
Valentin Lafargue, Ariel Guerra-Adames, Emmanuelle Claeys, Elouan Vuichard, Jean-Michel Loubes
Nonstandard Errors in AI Agents
We study whether state-of-the-art AI coding agents, given the same data and research question, produce the same empirical results. Deploying 150 autonomous Claude Code agents to independently test ...
Ruijiang Gao, Steven Chong Xiao
Differential Harm Propensity in Personalized LLM Agents: The Curious Case of Mental Health Disclosure
Large language models (LLMs) are increasingly deployed as tool-using agents, shifting safety concerns from harmful text generation to harmful task completion. Deployed systems often condition on us...
Caglar Yildirim
IQuest-Coder-V1 Technical Report
In this report, we introduce the IQuest-Coder-V1 series-(7B/14B/40B/40B-Loop), a new family of code large language models (LLMs). Moving beyond static code representations, we propose the code-flow...
Jian Yang, Wei Zhang, Shawn Guo, Zhengmao Ye, Lin Jing, Shark Liu, Yizhi Li, Jiajun Wu, Cening Li...
Understanding Quantization of Optimizer States in LLM Pre-training: Dynamics of State Staleness and Effectiveness of State Resets
Quantizing optimizer states is becoming an important ingredient of memory-efficient large-scale pre-training, but the resulting optimizer dynamics remain only partially understood. We study low-pre...
Kristi Topollai, Anna Choromanska
The Cost of Reasoning: Chain-of-Thought Induces Overconfidence in Vision-Language Models
Vision-language models (VLMs) are increasingly deployed in high-stakes settings where reliable uncertainty quantification (UQ) is as important as predictive accuracy. Extended reasoning via chain-o...
Robert Welch, Emir Konuk, Kevin Smith
Arabic Morphosyntactic Tagging and Dependency Parsing with Large Language Models
Large language models (LLMs) perform strongly on many NLP tasks, but their ability to produce explicit linguistic structure remains unclear. We evaluate instruction-tuned LLMs on two structured pre...
Mohamed Adel, Bashar Alhafni, Nizar Habash
Dataflow-Oriented Classification and Performance Analysis of GPU-Accelerated Homomorphic Encryption
Fully Homomorphic Encryption (FHE) enables secure computation over encrypted data, but its computational cost remains a major obstacle to practical deployment. To mitigate this overhead, many studi...
Ai Nozaki, Takuya Kojima, Hideki Takase, Hiroshi Nakamura
vAccSOL: Efficient and Transparent AI Vision Offloading for Mobile Robots
Mobile robots are increasingly deployed for inspection, patrol, and search-and-rescue operations, relying on computer vision for perception, navigation, and autonomous decision-making. However, exe...
Adam Zahir, Michele Gucciardom Falk Selker, Anastasios Nanos, Kostis Papazafeiropoulos, Carlos J....
A Semantic Timbre Dataset for the Electric Guitar
Understanding and manipulating timbre is central to audio synthesis, yet this remains under-explored in machine learning due to a lack of annotated datasets linking perceptual timbre dimensions to ...
Joseph Cameron, Alan Blackwell
When Should a Robot Think? Resource-Aware Reasoning via Reinforcement Learning for Embodied Robotic Decision-Making
Embodied robotic systems increasingly rely on large language model (LLM)-based agents to support high-level reasoning, planning, and decision-making during interactions with the environment. Howeve...
Jun Liu, Pu Zhao, Zhenglun Kong, Xuan Shen, Peiyan Dong, Fan Yang, Lin Cui, Hao Tang, Geng Yuan, ...
Kinema4D: Kinematic 4D World Modeling for Spatiotemporal Embodied Simulation
Simulating robot-world interactions is a cornerstone of Embodied AI. Recently, a few works have shown promise in leveraging video generations to transcend the rigid visual/physical constraints of t...
Mutian Xu, Tianbao Zhang, Tianqi Liu, Zhaoxi Chen, Xiaoguang Han, Ziwei Liu
When Openclaw Agents Learn from Each Other: Insights from Emergent AI Agent Communities for Human-AI Partnership in Education
The AIED community envisions AI evolving "from tools to teammates," yet our understanding of AI teammates remains limited to dyadic human-AI interactions. We offer a different vantage point: a rapi...
Eason Chen, Ce Guan, Ahmed Elshafiey, Zhonghao Zhao, Joshua Zekeri, Afeez Edeifo Shaibu, Emmanuel...
Can Linguistically Related Languages Guide LLM Translation in Low-Resource Settings?
Large Language Models (LLMs) have achieved strong performance across many downstream tasks, yet their effectiveness in extremely low-resource machine translation remains limited. Standard adaptatio...
Aishwarya Ramasethu, Niyathi Allu, Rohin Garg, Harshwardhan Fartale, Dun Li Chan
Machines acquire scientific taste from institutional traces
Artificial intelligence matches or exceeds human performance on tasks with verifiable answers, from protein folding to Olympiad mathematics. Yet the capacity that most governs scientific advance is...
Ziqin Gong, Ning Li, Huaikang Zhou
Omanic: Towards Step-wise Evaluation of Multi-hop Reasoning in Large Language Models
Reasoning-focused large language models (LLMs) have advanced in many NLP tasks, yet their evaluation remains challenging: final answers alone do not expose the intermediate reasoning steps, making ...
Xiaojie Gu, Sherry T. Tong, Aosong Feng, Sophia Simeng Han, Jinghui Lu, Yingjian Chen, Yusuke Iwa...
What if Pinocchio Were a Reinforcement Learning Agent: A Normative End-to-End Pipeline
In the past decade, artificial intelligence (AI) has developed quickly. With this rapid progression came the need for systems capable of complying with the rules and norms of our society so that th...
Benoît Alcaraz
Whose Knowledge Counts? Co-Designing Community-Centered AI Auditing Tools with Educators in Hawai`i
Although generative AI is being deployed into classrooms with promises of aiding teachers, educators caution that these tools can have unintended pedagogical repercussions, including cultural misre...
Dora Zhao, Hannah Cha, Michael J. Ryan, Angelina Wang, Rachel Baker-Ramos Evyn-Bree Helekahi-Kaiw...