Research

Papers

Research papers from arXiv and related sources

Total: 4513 · AI/LLM: 2483 · Testing: 2030
AI LLM

A Hybrid Federated Learning Based Ensemble Approach for Lung Disease Diagnosis Leveraging Fusion of SWIN Transformer and CNN

The significant advancements in computational power create a vast opportunity for using Artificial Intelligence in different applications of healthcare and medical science. A Hybrid FL-Enabled ...

Asif Hasan Chowdhury, Md. Fahim Islam, M Ragib Anjum Riad, Faiyaz Bin Hashem, Md Tanzim Reza, Md....

2602.17566 2026-02-19
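The entry above combines federated learning with an ensemble of vision models. The abstract excerpt does not show the paper's aggregation rule, but the standard federated-averaging step it presumably builds on can be sketched as follows (a minimal illustration with toy NumPy tensors, not the paper's method):

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Federated averaging: combine per-client model parameters,
    weighting each client by its local dataset size."""
    total = sum(client_sizes)
    return [
        sum(w[i] * (n / total) for w, n in zip(client_weights, client_sizes))
        for i in range(len(client_weights[0]))
    ]

# Three hypothetical clients, each holding two parameter tensors.
clients = [[np.ones((2, 2)) * k, np.ones(2) * k] for k in (1.0, 2.0, 3.0)]
sizes = [100, 100, 200]

global_w = fedavg(clients, sizes)
# Weighted mean: (1*100 + 2*100 + 3*200) / 400 = 2.25
```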
AI LLM

ODESteer: A Unified ODE-Based Steering Framework for LLM Alignment

Activation steering, or representation engineering, offers a lightweight approach to align large language models (LLMs) by manipulating their internal activations at inference time. However, curren...

Hongjue Zhao, Haosen Sun, Jiangtao Kong, Xiaochang Li, Qineng Wang, Liwei Jiang, Qi Zhu, Tarek Ab...

2602.17560 2026-02-19
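As background for the entry above: activation steering, in its simplest form, shifts a layer's hidden activations along a learned direction at inference time. The sketch below shows only that generic additive step; ODESteer's actual ODE-based formulation is not described in the excerpt, so this is an illustration of the baseline technique, not the paper's method:

```python
import numpy as np

def apply_steering(hidden, steer_vec, alpha=1.0):
    """Generic activation-steering step: add a scaled steering
    direction to a layer's hidden states at inference time."""
    return hidden + alpha * steer_vec

# Toy example: 2 token positions, hidden size 4.
h = np.zeros((2, 4))
v = np.array([1.0, 0.0, -1.0, 0.0])  # hypothetical "alignment" direction

h_steered = apply_steering(h, v, alpha=0.5)
```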
AI LLM

A Theoretical Framework for Modular Learning of Robust Generative Models

Training large-scale generative models is resource-intensive and relies heavily on heuristic dataset weighting. We address two fundamental questions: Can we train Large Language Models (LLMs) modul...

Corinna Cortes, Mehryar Mohri, Yutao Zhong

2602.17554 2026-02-19
AI LLM

MASPO: Unifying Gradient Utilization, Probability Mass, and Signal Reliability for Robust and Sample-Efficient LLM Reasoning

Existing Reinforcement Learning with Verifiable Rewards (RLVR) algorithms, such as GRPO, rely on rigid, uniform, and symmetric trust region mechanisms that are fundamentally misaligned with the com...

Xiaoliang Fu, Jiaye Lin, Yangyi Fang, Binbin Zheng, Chaowen Hu, Zekai Shao, Cong Qin, Lu Pan, Ke ...

2602.17550 2026-02-19
AI LLM

KLong: Training LLM Agent for Extremely Long-horizon Tasks

This paper introduces KLong, an open-source LLM agent trained to solve extremely long-horizon tasks. The principle is to first cold-start the model via trajectory-splitting SFT, then scale it via p...

Yue Liu, Zhiyuan Hu, Flood Sung, Jiaheng Zhang, Bryan Hooi

2602.17547 2026-02-19
AI LLM

Evaluating Chain-of-Thought Reasoning through Reusability and Verifiability

In multi-agent IR pipelines for tasks such as search and ranking, LLM-based agents exchange intermediate reasoning in terms of Chain-of-Thought (CoT) with each other. Current CoT evaluation narrowl...

Shashank Aggarwal, Ram Vikas Mishra, Amit Awekar

2602.17544 2026-02-19
AI LLM

Using LLMs for Knowledge Component-level Correctness Labeling in Open-ended Coding Problems

Fine-grained skill representations, commonly referred to as knowledge components (KCs), are fundamental to many approaches in student modeling and learning analytics. However, KC-level correctness ...

Zhangqi Duan, Arnav Kankaria, Dhruv Kartik, Andrew Lan

2602.17542 2026-02-19
AI LLM

Toward a Fully Autonomous, AI-Native Particle Accelerator

This position paper presents a vision for self-driving particle accelerators that operate autonomously with minimal human intervention. We propose that future facilities be designed through artific...

Chris Tennant

2602.17536 2026-02-19
AI LLM

Enhancing Large Language Models (LLMs) for Telecom using Dynamic Knowledge Graphs and Explainable Retrieval-Augmented Generation

Large language models (LLMs) have shown strong potential across a variety of tasks, but their application in the telecom field remains challenging due to domain complexity, evolving standards, and ...

Dun Yuan, Hao Zhou, Xue Liu, Hao Chen, Yan Xin, Jianzhong Zhang

2602.17529 2026-02-19
AI LLM

When Models Ignore Definitions: Measuring Semantic Override Hallucinations in LLM Reasoning

Large language models (LLMs) demonstrate strong performance on standard digital logic and Boolean reasoning tasks, yet their reliability under locally redefined semantics remains poorly understood....

Yogeswar Reddy Thota, Setareh Rafatirad, Houman Homayoun, Tooraj Nikoubin

2602.17520 2026-02-19
AI LLM

Pareto Optimal Benchmarking of AI Models on ARM Cortex Processors for Sustainable Embedded Systems

This work presents a practical benchmarking framework for optimizing artificial intelligence (AI) models on ARM Cortex processors (M0+, M4, M7), focusing on energy efficiency, accuracy, and resourc...

Pranay Jain, Maximilian Kasper, Göran Köber, Axel Plinge, Dominik Seuß

2602.17508 2026-02-19
AI LLM

Retrospective In-Context Learning for Temporal Credit Assignment with Large Language Models

Learning from self-sampled data and sparse environmental feedback remains a fundamental challenge in training self-evolving agents. Temporal credit assignment mitigates this issue by transforming s...

Wen-Tse Chen, Jiayu Chen, Fahim Tajwar, Hao Zhu, Xintong Duan, Ruslan Salakhutdinov, Jeff Schneider

2602.17497 2026-02-19
AI LLM

What Do LLMs Associate with Your Name? A Human-Centered Black-Box Audit of Personal Data

Large language models (LLMs), and conversational agents based on them, are exposed to personal data (PD) during pre-training and during user interactions. Prior work shows that PD can resurface, ye...

Dimitri Staufer, Kirsten Morehouse

2602.17483 2026-02-19
AI LLM

ShadAR: LLM-driven shader generation to transform visual perception in Augmented Reality

Augmented Reality (AR) can simulate various visual perceptions, such as how individuals with colorblindness see the world. However, these simulations require developers to predefine each visual eff...

Yanni Mei, Samuel Wendt, Florian Mueller, Jan Gugenheimer

2602.17481 2026-02-19
AI LLM

Small LLMs for Medical NLP: a Systematic Analysis of Few-Shot, Constraint Decoding, Fine-Tuning and Continual Pre-Training in Italian

Large Language Models (LLMs) consistently excel in diverse medical Natural Language Processing (NLP) tasks, yet their substantial computational requirements often limit deployment in real-world hea...

Pietro Ferrazzi, Mattia Franzin, Alberto Lavelli, Bernardo Magnini

2602.17475 2026-02-19
AI LLM

Auditing Reciprocal Sentiment Alignment: Inversion Risk, Dialect Representation and Intent Misalignment in Transformers

The core theme of bidirectional alignment is ensuring that AI systems accurately understand human intent and that humans can trust AI behavior. However, this loop fractures significantly across lan...

Nusrat Jahan Lia, Shubhashis Roy Dipta

2602.17469 2026-02-19
AI LLM

Entropy-Based Data Selection for Language Models

Modern language models (LMs) increasingly require two critical resources: computational resources and data resources. Data selection techniques can effectively reduce the amount of training data re...

Hongming Li, Yang Liu, Chao Huang

2602.17465 2026-02-19
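The entry above selects training data by an entropy criterion. The excerpt does not state the paper's exact rule, so the sketch below shows one plausible variant only: score each example by the mean Shannon entropy of its per-token predictive distributions and keep the top-k (all names and the selection direction are illustrative assumptions):

```python
import numpy as np

def token_entropy(probs):
    """Shannon entropy of each per-token probability distribution."""
    p = np.clip(probs, 1e-12, 1.0)
    return -np.sum(p * np.log(p), axis=-1)

def select_by_entropy(prob_batches, k):
    """Keep the k examples with the highest mean token entropy
    (one plausible criterion; not necessarily the paper's rule)."""
    scores = [float(token_entropy(p).mean()) for p in prob_batches]
    order = np.argsort(scores)[::-1]
    return [int(i) for i in order[:k]]

# Two toy "examples" of 3 tokens over a 4-word vocabulary:
peaked = np.tile([0.97, 0.01, 0.01, 0.01], (3, 1))   # low entropy
uniform = np.full((3, 4), 0.25)                       # high entropy

selected = select_by_entropy([peaked, uniform], k=1)
# selected → [1] (the near-uniform, high-entropy example)
```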
AI LLM

The CTI Echo Chamber: Fragmentation, Overlap, and Vendor Specificity in Twenty Years of Cyber Threat Reporting

Despite the high volume of open-source Cyber Threat Intelligence (CTI), our understanding of long-term threat actor-victim dynamics remains fragmented due to the lack of structured datasets and inc...

Manuel Suarez-Roman, Francesco Marciori, Mauro Conti, Juan Tapiador

2602.17458 2026-02-19
AI LLM

Jolt Atlas: Verifiable Inference via Lookup Arguments in Zero Knowledge

We present Jolt Atlas, a zero-knowledge machine learning (zkML) framework that extends the Jolt proving system to model inference. Unlike zkVMs (zero-knowledge virtual machines), which emulate CPU ...

Wyatt Benno, Alberto Centelles, Antoine Douchet, Khalil Gibran

2602.17452 2026-02-19
AI LLM

Beyond Pipelines: A Fundamental Study on the Rise of Generative-Retrieval Architectures in Web Research

Web research and practices have evolved significantly over time, offering users diverse and accessible solutions across a wide range of tasks. While advanced concepts such as Web 4.0 have emerged f...

Amirereza Abbasi, Mohsen Hooshmand

2602.17450 2026-02-19