Papers
Research papers from arXiv and related sources
Noise-Aware Misclassification Attack Detection in Collaborative DNN Inference
Collaborative inference of object classification deep neural networks (DNNs), where resource-constrained end-devices offload partially processed data to remote edge servers to complete end-to-end pr...
Shima Yousefi, Saptarshi Debroy
Pretrained Multilingual Transformers Reveal Quantitative Distance Between Human Languages
Understanding the distance between human languages is central to linguistics, anthropology, and tracing human evolutionary history. Yet, while linguistics has long provided rich qualitative account...
Yue Zhao, Jiatao Gu, Paloma Jeretič, Weijie Su
Actionable Recourse in Competitive Environments: A Dynamic Game of Endogenous Selection
Actionable recourse studies whether individuals can modify feasible features to overturn unfavorable outcomes produced by AI-assisted decision-support systems. However, many such systems operate in...
Ya-Ting Yang, Quanyan Zhu
Differential Privacy in Generative AI Agents: Analysis and Optimal Tradeoffs
Large language models (LLMs) and AI agents are increasingly integrated into enterprise systems to access internal databases and generate context-aware responses. While such integration improves pro...
Ya-Ting Yang, Quanyan Zhu
Grievance Politics vs. Policy Debates: A Cross-Platform Analysis of Conservative Discourse on Truth Social and Reddit
We present the first large-scale comparative analysis of Truth Social and the most popular conservative Reddit communities, r/Conservative, r/conservatives, and r/Republican. Using topic modeling w...
Yining Wang, Alhasan Abdellatif, Artemis Deligianni, Hannah Hok, Yusuf Mucahit Cetinkaya, Tugrulc...
Workers' Incentives and the Optimal Taxation of AI
We characterize the optimal tax policy in an economy with human manual and cognitive labor, physical capital, and artificial intelligence (AI). Extending the dynamic taxation setup of Slavik and Ya...
Jakub Growiec, Klaus Prettner, Maciej Szkróbka
A Creative Agent is Worth a 64-Token Template
Text-to-image (T2I) models have substantially improved image fidelity and prompt adherence, yet their creativity remains constrained by reliance on discrete natural language prompts. When presented...
Ruixiao Shi, Fu Feng, Yucheng Xie, Xu Yang, Jing Wang, Xin Geng
scicode-lint: Detecting Methodology Bugs in Scientific Python Code with LLM-Generated Patterns
Methodology bugs in scientific Python code produce plausible but incorrect results that traditional linters and static analysis tools cannot detect. Several research groups have built ML-specific l...
Sergey V. Samsonau
RAMP: Reinforcement Adaptive Mixed Precision Quantization for Efficient On Device LLM Inference
Post-training quantization is essential for deploying large language models (LLMs) on resource-constrained hardware, yet state-of-the-art methods enforce uniform bit widths across layers, yielding ...
Arpit Singh Gautam, Saurabh Jha
AI-Assisted Goal Setting Improves Goal Progress Through Social Accountability
Helping people identify and pursue personally meaningful career goals at scale remains a key challenge in applied psychology. Career coaching can improve goal quality and attainment, but its cost a...
Michel Schimpf, Julian Voigt, Thomas Bohné
DebugLM: Learning Traceable Training Data Provenance for LLMs
Large language models (LLMs) are trained through multi-stage pipelines over heterogeneous data sources, yet developers lack a principled way to pinpoint the specific data responsible for an observe...
Wenjie Jacky Mo, Qin Liu, Xiaofei Wen, Wenxuan Zhou, Zhe Zhao, Muhao Chen
Mitigating LLM Hallucinations through Domain-Grounded Tiered Retrieval
Large Language Models (LLMs) have achieved unprecedented fluency but remain susceptible to "hallucinations" - the generation of factually incorrect or ungrounded content. This limitation is particu...
Md. Asraful Haque, Aasar Mehdi, Maaz Mahboob, Tamkeen Fatima
Procedural Generation of Algorithm Discovery Tasks in Machine Learning
Automating the development of machine learning algorithms has the potential to unlock new breakthroughs. However, our ability to improve and evaluate algorithm discovery systems has thus far been l...
Alexander D. Goldie, Zilin Wang, Adrian Hayler, Deepak Nathani, Edan Toledo, Ken Thampiratwong, A...
Physics-Aware Machine Learning for Seismic and Volcanic Signal Interpretation
Modern seismic and volcanic monitoring is increasingly shaped by continuous, multi-sensor observations and by the need to extract actionable information from nonstationary, noisy wavefields. In thi...
William Thorossian
Revisiting foundation models for cell instance segmentation
Cell segmentation is a fundamental task in microscopy image analysis. Several foundation models for cell segmentation have been introduced; virtually all of them are extensions of Segment Anything ...
Anwai Archit, Constantin Pape
How do LLMs Compute Verbal Confidence
Verbal confidence -- prompting LLMs to state their confidence as a number or category -- is widely used to extract uncertainty estimates from black-box models. However, how LLMs internally generate...
Dharshan Kumaran, Arthur Conmy, Federico Barbero, Simon Osindero, Viorica Patraucean, Petar Velic...
Event-Centric Human Value Understanding in News-Domain Texts: An Actor-Conditioned, Multi-Granularity Benchmark
Existing human value datasets do not directly support value understanding in factual news: many are actor-agnostic, rely on isolated utterances or synthetic scenarios, and lack explicit event struc...
Yao Wang, Xin Liu, Zhuochen Liu, Jiankang Chen, Adam Jatowt, Kyoungsook Kim, Noriko Kando, Haitao Yu
ArchBench: Benchmarking Generative-AI for Software Architecture Tasks
Benchmarks for large language models (LLMs) have progressed from snippet-level function generation to repository-level issue resolution, yet they overwhelmingly target implementation correctness. S...
Bassam Adnan, Aviral Gupta, Sreemaee Akshathala, Karthik Vaidhyanathan
Text-to-Stage: Spatial Layouts from Long-form Narratives
In this work, we probe the ability of a language model to demonstrate spatial reasoning from unstructured text, mimicking human capabilities and automating a process that benefits many downstream m...
Jefferson Hernandez, Swarnadeep Saha, Chenxi Whitehouse, Sanjeel Parekh, Calvin Murdock, Yuliang ...
RPMS: Enhancing LLM-Based Embodied Planning through Rule-Augmented Memory Synergy
LLM agents often fail in closed-world embodied environments because actions must satisfy strict preconditions -- such as location, inventory, and container states -- and failure feedback is sparse....
Zhenhang Yuan, Shenghai Yuan, Lihua Xie