Research

Papers

Research papers from arXiv and related sources

Total: 4513, AI/LLM: 2483, Testing: 2030
TESTING

Polynomial Identity Testing and Reconstruction for Depth-4 Powering Circuits of High Degree

We study deterministic polynomial identity testing (PIT) and reconstruction algorithms for depth-$4$ arithmetic circuits of the form \[ \Sigma^{[r]}\!\wedge^{[d]}\!\Sigma^{[s]}\!\Pi^{[\delta]}. \] This model genera...

Amir Shpilka, Yann Tal

2602.20832 2026-02-24
TESTING

Rethinking Clause Management for CDCL SAT Solvers

Boolean Satisfiability (SAT) solving underpins a wide range of applications in Electronic Design Automation (EDA), particularly formal verification. However, this paper observes that the mainstream...

Yalun Cai, Xindi Zhang, Zhengyuan Shi, Mengxia Tao, Qiang Xu

2602.20829 2026-02-24
AI LLM

Pressure Reveals Character: Behavioural Alignment Evaluation at Depth

Evaluating alignment in language models requires testing how they behave under realistic pressure, not just what they claim they would do. While alignment failures increasingly cause real-world har...

Nora Petrova, John Burden

2602.20813 2026-02-24
AI LLM

Qwen-BIM: developing large language model for BIM-based design with domain-specific benchmark and dataset

As the construction industry advances toward digital transformation, BIM (Building Information Modeling)-based design has become a key driver supporting intelligent construction. Despite Large Lang...

Jia-Rui Lin, Yun-Hong Cai, Xiang-Rui Ni, Shaojie Zhou, Peng Pan

2602.20812 2026-02-24
TESTING

Probing Dec-POMDP Reasoning in Cooperative MARL

Cooperative multi-agent reinforcement learning (MARL) is typically framed as a decentralised partially observable Markov decision process (Dec-POMDP), a setting whose hardness stems from two key ch...

Kale-ab Tessera, Leonard Hinckeldey, Riccardo Zamboni, David Abel, Amos Storkey

2602.20804 2026-02-24
AI LLM

Mitigating Preference Leakage via Strict Estimator Separation for Normative Generative Ranking

In Generative Information Retrieval (GenIR), the bottleneck has shifted from generation to the selection of candidates, particularly for normative criteria such as cultural relevance. Current LLM-a...

Dalia Nahhas, Xiaohao Cai, Imran Razzak, Shoaib Jameel

2602.20800 2026-02-24
AI LLM

Unseen-Codebases-Domain Data Synthesis and Training Based on Code Graphs

In the context of newly released software frameworks, large language models (LLMs) often exhibit poor performance and a high rate of hallucination, as they are not exposed to such environments durin...

Guangsheng Ou, Qiming Zhang, Sirong Chen, Anji Li, Dong Xu, Tiancheng Luo, Dekun Dai, Cuiyun Gao,...

2602.20799 2026-02-24
TESTING

Enabling FR2-5G Communication with Dielectric OAM Transmitarrays

This paper investigates the potential of near-field (NF) indoor communications in the FR2 frequency bands using fully dielectric structures to generate orbital angular momentum (OAM) waves. All-die...

Miguel Á. Balmaseda-Márquez, Juan E. Galeote-Cazorla, Álvaro Liébana-Bolívar, Alejandro Ramírez-A...

2602.20777 2026-02-24
AI LLM

Federated Learning for Cross-Modality Medical Image Segmentation via Augmentation-Driven Generalization

Artificial intelligence has emerged as a transformative tool in medical image analysis, yet developing robust and generalizable segmentation models remains difficult due to fragmented, privacy-cons...

Sachin Dudda Nagaraju, Ashkan Moradi, Bendik Skarre Abrahamsen, Mattijs Elschot

2602.20773 2026-02-24
AI LLM

Pipeline for Verifying LLM-Generated Mathematical Solutions

With the growing popularity of Large Reasoning Models and their results in solving mathematical problems, it becomes crucial to measure their capabilities. We introduce a pipeline for both automati...

Varvara Sazonova, Dmitri Shmelkin, Stanislav Kikot, Vasily Motolygin

2602.20770 2026-02-24
AI LLM

Overton Pluralistic Reinforcement Learning for Large Language Models

Existing alignment paradigms remain limited in capturing the pluralistic nature of human values. Overton Pluralism addresses this gap by generating responses with diverse perspectives from a single...

Yu Fu, Seongho Son, Ilija Bogunovic

2602.20759 2026-02-24
AI LLM

SibylSense: Adaptive Rubric Learning via Memory Tuning and Adversarial Probing

Designing aligned and robust rewards for open-ended generation remains a key barrier to RL post-training. Rubrics provide structured, interpretable supervision, but scaling rubric construction is d...

Yifei Xu, Guilherme Potje, Shivam Shandilya, Tiancheng Yuan, Leonardo de Oliveira Nunes, Rakshand...

2602.20751 2026-02-24
TESTING

Atomic Spectroscopy Probes of New Physics

Precision spectroscopy has long played a central role in testing the foundations of physics, from the early insights that led to the development of quantum mechanics to the validation of quantum el...

Cédric Delaunay, Jean-Philippe Karr, Yotam Soreq

2602.20750 2026-02-24
TESTING

Evidence of a non-equipartition energy regime in 1803+784: Core-shift and Faraday rotation measurements from simultaneous multi-frequency polarimetric VGOS observations

Context. Compact jets from active galactic nuclei (AGN) are commonly assumed to be in equipartition between particle and magnetic-field energy densities at the regions where the radio emission domi...

V. Pérez-Díez, I. Martí-Vidal, E. Albentosa-Ruiz, R. Bachiller

2602.20746 2026-02-24
AI LLM

Adaptive Text Anonymization: Learning Privacy-Utility Trade-offs via Prompt Optimization

Anonymizing textual documents is a highly context-sensitive problem: the appropriate balance between privacy protection and utility preservation varies with the data domain, privacy objectives, and...

Gabriel Loiseau, Damien Sileo, Damien Riquet, Maxime Meyer, Marc Tommasi

2602.20743 2026-02-24
AI LLM

RMIT-ADM+S at the MMU-RAG NeurIPS 2025 Competition

This paper presents the award-winning RMIT-ADM+S system for the Text-to-Text track of the NeurIPS 2025 MMU-RAG Competition. We introduce Routing-to-RAG (R2RAG), a research-focused retrieval-aug...

Kun Ran, Marwah Alaofi, Danula Hettiachchi, Chenglong Ma, Khoi Nguyen Dinh Anh, Khoi Vo Nguyen, S...

2602.20735 2026-02-24
AI LLM

CHESS: Context-aware Hierarchical Efficient Semantic Selection for Long-Context LLM Inference

Long-context LLMs demand accurate inference at low latency, yet decoding becomes primarily constrained by KV cache as context grows. Prior pruning methods are largely context-agnostic: their token ...

Chao Fei, Guozhong Li, Chenxi Liu, Panos Kalnis

2602.20732 2026-02-24
AI LLM

Balancing Multiple Objectives in Urban Traffic Control with Reinforcement Learning from AI Feedback

Reward design has been one of the central challenges for real-world reinforcement learning (RL) deployment, especially in settings with multiple objectives. Preference-based RL offers an appealing ...

Chenyang Zhao, Vinny Cahill, Ivana Dusparic

2602.20728 2026-02-24
AI LLM

ID-LoRA: Efficient Low-Rank Adaptation Inspired by Matrix Interpolative Decomposition

LoRA has become a universal Parameter-Efficient Fine-Tuning (PEFT) technique that equips Large Language Models (LLMs) to adapt quickly to new tasks. However, when these models are scaled up, even t...

Xindian Ma, Rundong Kong, Peng Zhang, Ruoxiang Huang, Yongyu Jiang

2602.20727 2026-02-24
AI LLM

Bridging Physically Based Rendering and Diffusion Models with Stochastic Differential Equation

Diffusion-based image generators excel at producing realistic content from text or image conditions, but they offer only limited explicit control over low-level, physically grounded shading and mat...

Junwei Shu, Wenjie Liu, Changgu Chen, Hantang Liu, Yang Li, Changbo Wang

2602.20725 2026-02-24