Papers

Research papers from arXiv and related sources

Total: 4513 | AI/LLM: 2483 | Testing: 2030
AI LLM

ACOS: Arrays of Cheap Optical Switches

Machine learning training places immense demands on cluster networks, motivating specialized architectures and co-design with parallelization strategies. Recent designs incorporating optical circui...

Daniel Amir, Ori Cohen, Jakob Krebs, Mark Silberstein

2602.17449 2026-02-19
AI LLM

Do Hackers Dream of Electric Teachers?: A Large-Scale, In-Situ Evaluation of Cybersecurity Student Behaviors and Performance with AI Tutors

To meet the ever-increasing demands of the cybersecurity workforce, AI tutors have been proposed for personalized, scalable education. But, while AI tutors have shown promise in introductory progra...

Michael Tompkins, Nihaarika Agarwal, Ananta Soneji, Robert Wasinger, Connor Nelson, Kevin Leach, ...

2602.17448 2026-02-19
AI LLM

ABCD: All Biases Come Disguised

Multiple-choice question (MCQ) benchmarks have been a standard evaluation practice for measuring LLMs' ability to reason and answer knowledge-based questions. Through a synthetic NonsenseQA benchma...

Mateusz Nowak, Xavier Cadet, Peter Chin

2602.17445 2026-02-19
AI LLM

AIDG: Evaluating Asymmetry Between Information Extraction and Containment in Multi-Turn Dialogue

Evaluating the strategic reasoning capabilities of Large Language Models (LLMs) requires moving beyond static benchmarks to dynamic, multi-turn interactions. We introduce AIDG (Adversarial Informat...

Adib Sakhawat, Fardeen Sadab, Rakin Shahriar

2602.17443 2026-02-19
AI LLM

WarpRec: Unifying Academic Rigor and Industrial Scale for Responsible, Reproducible, and Efficient Recommendation

Innovation in Recommender Systems is currently impeded by a fractured ecosystem, where researchers must choose between the ease of in-memory experimentation and the costly, complex rewriting requir...

Marco Avolio, Potito Aghilar, Sabino Roccotelli, Vito Walter Anelli, Chiara Mallamaci, Vincenzo P...

2602.17442 2026-02-19
AI LLM

Preserving Historical Truth: Detecting Historical Revisionism in Large Language Models

Large language models (LLMs) are increasingly used as sources of historical information, motivating the need for scalable audits on contested events and politically charged narratives in settings t...

Francesco Ortu, Joeun Yook, Punya Syon Pandey, Keenan Samway, Bernhard Schölkopf, Alberto Cazzani...

2602.17433 2026-02-19
AI LLM

Fine-Grained Uncertainty Quantification for Long-Form Language Model Outputs: A Comparative Study

Uncertainty quantification has emerged as an effective approach to closed-book hallucination detection for LLMs, but existing methods are largely designed for short-form outputs and do not generali...

Dylan Bouchard, Mohit Singh Chauhan, Viren Bajaj, David Skarbrevik

2602.17431 2026-02-19
AI LLM

Evaluating Extremely Low-Resource Machine Translation: A Comparative Study of ChrF++ and BLEU Metrics

Evaluating machine translation (MT) quality in extremely low-resource language (ELRL) scenarios poses unique challenges, as widely used metrics such as BLEU, effective in high-resource settings, of...

Sanjeev Kumar, Preethi Jyothi, Pushpak Bhattacharyya

2602.17425 2026-02-19
AI LLM

Convergence Analysis of Two-Layer Neural Networks under Gaussian Input Masking

We investigate the convergence guarantee of two-layer neural network training with Gaussian randomly masked inputs. This scenario corresponds to Gaussian dropout at the input level, or noisy input ...

Afroditi Kolomvaki, Fangshuo Liao, Evan Dramko, Ziyun Guang, Anastasios Kyrillidis

2602.17423 2026-02-19
AI LLM

A Privacy by Design Framework for Large Language Model-Based Applications for Children

Children are increasingly using technologies powered by Artificial Intelligence (AI). However, there are growing concerns about privacy risks, particularly for children. Although existing privacy r...

Diana Addae, Diana Rogachova, Nafiseh Kahani, Masoud Barati, Michael Christensen, Chen Zhou

2602.17418 2026-02-19
AI LLM

DAVE: A Policy-Enforcing LLM Spokesperson for Secure Multi-Document Data Sharing

In current inter-organizational data spaces, usage policies are enforced mainly at the asset level: a whole document or dataset is either shared or withheld. When only parts of a document are sensi...

René Brinkhege, Prahlad Menon

2602.17413 2026-02-19
AI LLM

Improving LLM-based Recommendation with Self-Hard Negatives from Intermediate Layers

Large language models (LLMs) have shown great promise in recommender systems, where supervised fine-tuning (SFT) is commonly used for adaptation. Subsequent studies further introduce preference lea...

Bingqian Li, Bowen Zheng, Xiaolei Wang, Long Zhang, Jinpeng Wang, Sheng Chen, Wayne Xin Zhao, Ji-...

2602.17410 2026-02-19
AI LLM

Voice-Driven Semantic Perception for UAV-Assisted Emergency Networks

Unmanned Aerial Vehicle (UAV)-assisted networks are increasingly foreseen as a promising approach for emergency response, providing rapid, flexible, and resilient communications in environments whe...

Nuno Saavedra, Pedro Ribeiro, André Coelho, Rui Campos

2602.17394 2026-02-19
AI LLM

Insidious Imaginaries: A Critical Overview of AI Speculations

Speculative thinking about the capabilities and implications of artificial intelligence (AI) influences computer science research, drives AI industry practices, feeds academic studies of existentia...

Dejan Grba

2602.17383 2026-02-19
AI LLM

The Role of the Availability Heuristic in Multiple-Choice Answering Behaviour

When students are unsure of the correct answer to a multiple-choice question (MCQ), guessing is common practice. The availability heuristic, proposed by A. Tversky and D. Kahneman in 1973, suggests...

Leonidas Zotos, Hedderik van Rijn, Malvina Nissim

2602.17377 2026-02-19
AI LLM

RPDR: A Round-trip Prediction-Based Data Augmentation Framework for Long-Tail Question Answering

Long-tail question answering presents significant challenges for large language models (LLMs) due to their limited ability to acquire and accurately recall less common knowledge. Retrieval-augmente...

Yiming Zhang, Siyue Zhang, Junbo Zhao, Chen Zhao

2602.17366 2026-02-19
AI LLM

Astra: AI Safety, Trust, & Risk Assessment

This paper argues that existing global AI safety frameworks exhibit contextual blindness towards India's unique socio-technical landscape. With a population of 1.5 billion and a massive informal ec...

Pranav Aggarwal, Ananya Basotia, Debayan Gupta, Rahul Kulkarni, Shalini Kapoor, Kashyap J., A. Mu...

2602.17357 2026-02-19
AI LLM

What Breaks Embodied AI Security: LLM Vulnerabilities, CPS Flaws, or Something Else?

Embodied AI systems (e.g., autonomous vehicles, service robots, and LLM-driven interactive agents) are rapidly transitioning from controlled environments to safety critical real-world deployments. ...

Boyang Ma, Hechuan Guo, Peizhuo Lv, Minghui Xu, Xuelong Dai, YeChao Zhang, Yijun Yang, Yue Zhang

2602.17345 2026-02-19
AI LLM

PersonaMail: Learning and Adapting Personal Communication Preferences for Context-Aware Email Writing

LLM-assisted writing has seen rapid adoption in interpersonal communication, yet current systems often fail to capture the subtle tones essential for effectiveness. Email writing exemplifies this c...

Rui Yao, Qiuyuan Ren, Felicia Fang-Yi Tan, Chen Yang, Xiaoyu Zhang, Shengdong Zhao

2602.17340 2026-02-19
AI LLM

The Sound of Death: Deep Learning Reveals Vascular Damage from Carotid Ultrasound

Cardiovascular diseases (CVDs) remain the leading cause of mortality worldwide, yet early risk detection is often limited by available diagnostics. Carotid ultrasound, a non-invasive and widely acc...

Christoph Balada, Aida Romano-Martinez, Payal Varshney, Vincent ten Cate, Katharina Geschke, Jona...

2602.17321 2026-02-19