Papers
Research papers from arXiv and related sources
ACOS: Arrays of Cheap Optical Switches
Machine learning training places immense demands on cluster networks, motivating specialized architectures and co-design with parallelization strategies. Recent designs incorporating optical circui...
Daniel Amir, Ori Cohen, Jakob Krebs, Mark Silberstein
Do Hackers Dream of Electric Teachers?: A Large-Scale, In-Situ Evaluation of Cybersecurity Student Behaviors and Performance with AI Tutors
To meet the ever-increasing demands of the cybersecurity workforce, AI tutors have been proposed for personalized, scalable education. But, while AI tutors have shown promise in introductory progra...
Michael Tompkins, Nihaarika Agarwal, Ananta Soneji, Robert Wasinger, Connor Nelson, Kevin Leach, ...
ABCD: All Biases Come Disguised
Multiple-choice question (MCQ) benchmarks have been a standard evaluation practice for measuring LLMs' ability to reason and answer knowledge-based questions. Through a synthetic NonsenseQA benchma...
Mateusz Nowak, Xavier Cadet, Peter Chin
AIDG: Evaluating Asymmetry Between Information Extraction and Containment in Multi-Turn Dialogue
Evaluating the strategic reasoning capabilities of Large Language Models (LLMs) requires moving beyond static benchmarks to dynamic, multi-turn interactions. We introduce AIDG (Adversarial Informat...
Adib Sakhawat, Fardeen Sadab, Rakin Shahriar
WarpRec: Unifying Academic Rigor and Industrial Scale for Responsible, Reproducible, and Efficient Recommendation
Innovation in Recommender Systems is currently impeded by a fractured ecosystem, where researchers must choose between the ease of in-memory experimentation and the costly, complex rewriting requir...
Marco Avolio, Potito Aghilar, Sabino Roccotelli, Vito Walter Anelli, Chiara Mallamaci, Vincenzo P...
Preserving Historical Truth: Detecting Historical Revisionism in Large Language Models
Large language models (LLMs) are increasingly used as sources of historical information, motivating the need for scalable audits on contested events and politically charged narratives in settings t...
Francesco Ortu, Joeun Yook, Punya Syon Pandey, Keenan Samway, Bernhard Schölkopf, Alberto Cazzani...
Fine-Grained Uncertainty Quantification for Long-Form Language Model Outputs: A Comparative Study
Uncertainty quantification has emerged as an effective approach to closed-book hallucination detection for LLMs, but existing methods are largely designed for short-form outputs and do not generali...
Dylan Bouchard, Mohit Singh Chauhan, Viren Bajaj, David Skarbrevik
Evaluating Extremely Low-Resource Machine Translation: A Comparative Study of ChrF++ and BLEU Metrics
Evaluating machine translation (MT) quality in extremely low-resource language (ELRL) scenarios poses unique challenges, as widely used metrics such as BLEU, effective in high-resource settings, of...
Sanjeev Kumar, Preethi Jyothi, Pushpak Bhattacharyya
Convergence Analysis of Two-Layer Neural Networks under Gaussian Input Masking
We investigate the convergence guarantee of two-layer neural network training with Gaussian randomly masked inputs. This scenario corresponds to Gaussian dropout at the input level, or noisy input ...
Afroditi Kolomvaki, Fangshuo Liao, Evan Dramko, Ziyun Guang, Anastasios Kyrillidis
A Privacy by Design Framework for Large Language Model-Based Applications for Children
Children are increasingly using technologies powered by Artificial Intelligence (AI). However, there are growing concerns about privacy risks, particularly for children. Although existing privacy r...
Diana Addae, Diana Rogachova, Nafiseh Kahani, Masoud Barati, Michael Christensen, Chen Zhou
DAVE: A Policy-Enforcing LLM Spokesperson for Secure Multi-Document Data Sharing
In current inter-organizational data spaces, usage policies are enforced mainly at the asset level: a whole document or dataset is either shared or withheld. When only parts of a document are sensi...
René Brinkhege, Prahlad Menon
Improving LLM-based Recommendation with Self-Hard Negatives from Intermediate Layers
Large language models (LLMs) have shown great promise in recommender systems, where supervised fine-tuning (SFT) is commonly used for adaptation. Subsequent studies further introduce preference lea...
Bingqian Li, Bowen Zheng, Xiaolei Wang, Long Zhang, Jinpeng Wang, Sheng Chen, Wayne Xin Zhao, Ji-...
Voice-Driven Semantic Perception for UAV-Assisted Emergency Networks
Unmanned Aerial Vehicle (UAV)-assisted networks are increasingly foreseen as a promising approach for emergency response, providing rapid, flexible, and resilient communications in environments whe...
Nuno Saavedra, Pedro Ribeiro, André Coelho, Rui Campos
Insidious Imaginaries: A Critical Overview of AI Speculations
Speculative thinking about the capabilities and implications of artificial intelligence (AI) influences computer science research, drives AI industry practices, feeds academic studies of existentia...
Dejan Grba
The Role of the Availability Heuristic in Multiple-Choice Answering Behaviour
When students are unsure of the correct answer to a multiple-choice question (MCQ), guessing is common practice. The availability heuristic, proposed by A. Tversky and D. Kahneman in 1973, suggests...
Leonidas Zotos, Hedderik van Rijn, Malvina Nissim
RPDR: A Round-trip Prediction-Based Data Augmentation Framework for Long-Tail Question Answering
Long-tail question answering presents significant challenges for large language models (LLMs) due to their limited ability to acquire and accurately recall less common knowledge. Retrieval-augmente...
Yiming Zhang, Siyue Zhang, Junbo Zhao, Chen Zhao
Astra: AI Safety, Trust, & Risk Assessment
This paper argues that existing global AI safety frameworks exhibit contextual blindness towards India's unique socio-technical landscape. With a population of 1.5 billion and a massive informal ec...
Pranav Aggarwal, Ananya Basotia, Debayan Gupta, Rahul Kulkarni, Shalini Kapoor, Kashyap J., A. Mu...
What Breaks Embodied AI Security: LLM Vulnerabilities, CPS Flaws, or Something Else?
Embodied AI systems (e.g., autonomous vehicles, service robots, and LLM-driven interactive agents) are rapidly transitioning from controlled environments to safety-critical real-world deployments. ...
Boyang Ma, Hechuan Guo, Peizhuo Lv, Minghui Xu, Xuelong Dai, YeChao Zhang, Yijun Yang, Yue Zhang
PersonaMail: Learning and Adapting Personal Communication Preferences for Context-Aware Email Writing
LLM-assisted writing has seen rapid adoption in interpersonal communication, yet current systems often fail to capture the subtle tones essential for effectiveness. Email writing exemplifies this c...
Rui Yao, Qiuyuan Ren, Felicia Fang-Yi Tan, Chen Yang, Xiaoyu Zhang, Shengdong Zhao
The Sound of Death: Deep Learning Reveals Vascular Damage from Carotid Ultrasound
Cardiovascular diseases (CVDs) remain the leading cause of mortality worldwide, yet early risk detection is often limited by available diagnostics. Carotid ultrasound, a non-invasive and widely acc...
Christoph Balada, Aida Romano-Martinez, Payal Varshney, Vincent ten Cate, Katharina Geschke, Jona...