Papers
Research papers from arXiv and related sources
UXSim: Towards a Hybrid User Search Simulation
Simulating nuanced user experiences within complex interactive search systems poses distinct challenges for traditional methodologies, which often rely on static user proxies or, more recently, on s...
Saber Zerhoudi, Michael Granitzer
SafeGen-LLM: Enhancing Safety Generalization in Task Planning for Robotic Systems
Safety-critical task planning in robotic systems remains challenging: classical planners suffer from poor scalability, Reinforcement Learning (RL)-based methods generalize poorly, and base Large La...
Jialiang Fan, Weizhe Xu, Mengyu Liu, Oleg Sokolsky, Insup Lee, Fangxin Kong
Anansi: Scalable Characterization of Message-Based Job Scams
Job-based smishing scams, where victims are recruited under the guise of remote job opportunities, represent a rapidly growing and understudied threat within the broader landscape of online fraud. ...
Abisheka Pitumpe, Amir Rahmati
Controllable Reasoning Models Are Private Thinkers
AI agents powered by reasoning models require access to sensitive user data. However, their reasoning traces are difficult to control, which can result in the unintended leakage of private informat...
Haritz Puerto, Haonan Li, Xudong Han, Timothy Baldwin, Iryna Gurevych
An Efficient Unsupervised Federated Learning Approach for Anomaly Detection in Heterogeneous IoT Networks
Federated learning (FL) is an effective paradigm for distributed environments such as the Internet of Things (IoT), where data from diverse devices with varying functionalities remains localized wh...
Mohsen Tajgardan, Atena Shiranzaei, Mahdi Rabbani, Reza Khoshkangini, Mahtab Jamali
Beyond Explainable AI (XAI): An Overdue Paradigm Shift and Post-XAI Research Directions
This study provides a cross-disciplinary examination of Explainable Artificial Intelligence (XAI) approaches, focusing on deep neural networks (DNNs) and large language models (LLMs), and identifies ...
Saleh Afroogh, Seyd Ishtiaque Ahmed, Petra Ahrweiler, David Alvarez-Melis, Mansur Maturidi Arief,...
LemmaBench: A Live, Research-Level Benchmark to Evaluate LLM Capabilities in Mathematics
We present a new approach for benchmarking Large Language Model (LLM) capabilities on research-level mathematics. Existing benchmarks largely rely on static, hand-curated sets of contest or textboo...
Antoine Peyronnet, Fabian Gloeckle, Amaury Hayat
ArgLLM-App: An Interactive System for Argumentative Reasoning with Large Language Models
Argumentative LLMs (ArgLLMs) are an existing approach leveraging Large Language Models (LLMs) and computational argumentation for decision-making, with the aim of making the resulting decisions fai...
Adam Dejl, Deniz Gorur, Francesca Toni
What You Read is What You Classify: Highlighting Attributions to Text and Text-Like Inputs
At present, there are no easily understood explainable artificial intelligence (AI) methods for discrete token inputs, like text. Most explainable AI techniques do not extend well to token sequence...
Daniel S. Berman, Brian Merritt, Stanley Ta, Dana Udwin, Amanda Ernlund, Jeremy Ratcliff, Vijay N...
"Make It Sound Like a Lawyer Wrote It": Scenarios of Potential Impacts of Generative AI for Legal Conflict Resolution
Generative AI (GenAI) tools are transforming critical societal domains, including the legal sector. While these tools create opportunities such as increased efficiency and potential improvements in...
Kimon Kieslich, Natali Helberger, Nicholas Diakopoulos
Terminology Rarity Predicts Catastrophic Failure in LLM Translation of Low-Resource Ancient Languages: Evidence from Ancient Greek
This study presents the first systematic, reference-free human evaluation of large language model (LLM) machine translation (MT) for Ancient Greek (AG) technical prose. We evaluate translations by ...
James L. Zainaldin, Cameron Pattison, Manuela Marai, Jacob Wu, Mark J. Schiefsky
Agentic AI-RAN: Enabling Intent-Driven, Explainable and Self-Evolving Open RAN Intelligence
Open RAN (O-RAN) exposes rich control and telemetry interfaces across the Non-RT RIC, Near-RT RIC, and distributed units, but also makes it harder to operate multi-tenant, multi-objective RANs in a...
Zhizhou He, Yang Luo, Xinkai Liu, Mahdi Boloursaz Mashhadi, Mohammad Shojafar, Merouane Debbah, R...
ARGUS: Seeing the Influence of Narrative Features on Persuasion in Argumentative Texts
Can narratives make arguments more persuasive? And to this end, which narrative features matter most? Although stories are often seen as powerful tools for persuasion, their specific role in online...
Sara Nabhani, Federico Pianzola, Khalid Al-Khatib, Malvina Nissim
Artificial Agency Program: Curiosity, compression, and communication in agents
This paper presents the Artificial Agency Program (AAP), a position and research agenda for building AI systems as reality-embedded, resource-bounded agents whose development is driven by curiosity...
Richard Csaky
Bi-level RL-Heuristic Optimization for Real-world Winter Road Maintenance
Winter road maintenance is critical for ensuring public safety and reducing environmental impacts, yet existing methods struggle to manage large-scale routing problems effectively and mostly rely ...
Yue Xie, Zizhen Xu, William Beazley, Fumiya Iida
The impacts of artificial intelligence on environmental sustainability and human well-being
Artificial Intelligence (AI) is changing the world, but its impacts on the environment and human well-being remain uncertain. We conducted a systematic literature review of 1,291 studies selected f...
Noemi Luna Carmeno, Tiago Domingos, Daniel W. O'Neill
Precision Studies and Searches for CP Asymmetries in the Inclusive Decay $Λ_{c}^{+}\to ΛX$
Based on $e^+e^-$ annihilation data collected with the BESIII detector at center-of-mass energies from 4.600 to 4.699 GeV, corresponding to an integrated luminosity of 4.5 fb$^{-1}$, we present the...
BESIII Collaboration, M. Ablikim, M. N. Achasov, P. Adlarson, X. C. Ai, C. S. Akondi, R. Alibert...
Shaping the Digital Future of ErUM Research: Sustainability & Ethics
This workshop report from "Shaping the Digital Future of ErUM Research: Sustainability & Ethics" (Aachen, 2025) reviews progress on sustainability measures in data-intensive ErUM-Data research sinc...
Luca Di Bella, Jan Bürger, Markus Demleitner, Torsten Enßlin, Johannes Erdmann, Martin Erdmann, B...
The Subjectivity of Monoculture
Machine learning models -- including large language models (LLMs) -- are often said to exhibit monoculture, where outputs agree strikingly often. But what does it actually mean for models to agree ...
Nathanael Jo, Nikhil Garg, Manish Raghavan
Preference Packing: Efficient Preference Optimization for Large Language Models
Resource-efficient training optimization techniques are becoming increasingly important as the size of large language models (LLMs) continues to grow. In particular, batch packing is commonly used ...
Jaekyung Cho