Research

Papers

Research papers from arXiv and related sources

Total: 4513, AI/LLM: 2483, Testing: 2030
TESTING

Seeing Farther and Smarter: Value-Guided Multi-Path Reflection for VLM Policy Optimization

Solving complex, long-horizon robotic manipulation tasks requires a deep understanding of physical interactions, reasoning about their long-term consequences, and precise high-level planning. Visio...

Yanting Yang, Shenyuan Gao, Qingwen Bu, Li Chen, Dimitris N. Metaxas

2602.19372 2026-02-22
TESTING

LLMs Can Learn to Reason Via Off-Policy RL

Reinforcement learning (RL) approaches for Large Language Models (LLMs) frequently use on-policy algorithms, such as PPO or GRPO. However, policy lag from distributed training architectures and dif...

Daniel Ritter, Owen Oertell, Bradley Guo, Jonathan Chang, Kianté Brantley, Wen Sun

2602.19362 2026-02-22
TESTING

MentalBlackboard: Evaluating Spatial Visualization via Mathematical Transformations

Spatial visualization is the mental ability to imagine, transform, and manipulate the spatial characteristics of objects and actions. This intelligence is a part of human cognition where actions an...

Nilay Yilmaz, Maitreya Patel, Naga Sai Abhiram Kusumba, Yixuan He, Yezhou Yang

2602.19357 2026-02-22
TESTING

PerSoMed: A Large-Scale Balanced Dataset for Persian Social Media Text Classification

This research introduces the first large-scale, well-balanced Persian social media text classification dataset, specifically designed to address the lack of comprehensive resources in this domain. ...

Isun Chehreh, Ebrahim Ansari

2602.19333 2026-02-22
TESTING

Partial Soft-Matching Distance for Neural Representational Comparison with Partial Unit Correspondence

Representational similarity metrics typically force all units to be matched, making them susceptible to noise and outliers common in neural representations. We extend the soft-matching distance to ...

Chaitanya Kapoor, Alex H. Williams, Meenakshi Khosla

2602.19331 2026-02-22
TESTING

RetinaVision: XAI-Driven Augmented Regulation for Precise Retinal Disease Classification using deep learning framework

Early and accurate classification of retinal diseases is critical to counter vision loss and for guiding clinical management of retinal diseases. In this study, we proposed a deep learning method f...

Mohammad Tahmid Noor, Shayan Abrar, Jannatul Adan Mahi, Md Parvez Mia, Asaduzzaman Hridoy, Samant...

2602.19324 2026-02-22
TESTING

IPv2: An Improved Image Purification Strategy for Real-World Ultra-Low-Dose Lung CT Denoising

The image purification strategy constructs an intermediate distribution with aligned anatomical structures, which effectively corrects the spatial misalignment between real-world ultra-low-dose CT ...

Guoliang Gong, Man Yu

2602.19314 2026-02-22
TESTING

Learning partial transpose signatures in qubit ququart states from a few measurements

Higher-dimensional quantum systems are attracting interest for improving quantum protocol performance by increasing memory space. Characterizing quantum resources of such systems is fundamental but...

Christian Candeago, Paolo Da Rold, Michele Grossi, Pawel Horodecki, Antonio Mandarino

2602.19307 2026-02-22
TESTING

The Path to Conversational AI Tutors: Integrating Tutoring Best Practices and Targeted Technologies to Produce Scalable AI Agents

The emergence of generative AI has accelerated the development of conversational tutoring systems that interact with students through natural language dialogue. Unlike prior intelligent tutoring sy...

Kirk Vanacore, Ryan S. Baker, Avery H. Closser, Jeremy Roschelle

2602.19303 2026-02-22
TESTING

Time-Varying Hazard Patterns and Co-Mutation Profiles of KRAS G12C and G12D in Real-World NSCLC

Background: KRAS mutations are the largest oncogenic subset in NSCLC. While KRAS G12C is now targetable, no approved therapies exist for G12D. We examined time-to-next-treatment (TTNT) and overall ...

Robert Amevor, Dennis Baidoo, Emmanuel Kubuafor

2602.19295 2026-02-22
TESTING

Towards Automated Page Object Generation for Web Testing using Large Language Models

Page Objects (POs) are a widely adopted design pattern for improving the maintainability and scalability of automated end-to-end web tests. However, creating and maintaining POs is still largely a ...

Betül Karagöz, Filippo Ricca, Matteo Biagiola, Andrea Stocco

2602.19294 2026-02-22
TESTING

AdsorbFlow: energy-conditioned flow matching enables fast and realistic adsorbate placement

Identifying low-energy adsorption geometries on catalytic surfaces is a practical bottleneck for computational heterogeneous catalysis: the difficulty lies not only in the cost of density functiona...

Jiangjie Qiu, Wentao Li, Honghao Chen, Leyi Zhao, Xiaonan Wang

2602.19289 2026-02-22
TESTING

Limited Reasoning Space: The cage of long-horizon reasoning in LLMs

The test-time compute strategy, such as Chain-of-Thought (CoT), has significantly enhanced the ability of large language models to solve complex tasks like logical reasoning. However, empirical stu...

Zhenyu Li, Guanlin Wu, Cheems Wang, Yongqiang Zhao

2602.19281 2026-02-22
TESTING

DD-CAM: Minimal Sufficient Explanations for Vision Models Using Delta Debugging

We introduce a gradient-free framework for identifying minimal, sufficient, and decision-preserving explanations in vision models by isolating the smallest subset of representational units whose jo...

Krishna Khadka, Yu Lei, Raghu N. Kacker, D. Richard Kuhn

2602.19274 2026-02-22
AI LLM

VIRAASAT: Traversing Novel Paths for Indian Cultural Reasoning

Large Language Models (LLMs) have made significant progress in reasoning tasks across various domains such as mathematics and coding. However, their performance deteriorates in tasks requiring rich...

Harshul Raj Surana, Arijit Maji, Aryan Vats, Akash Ghosh, Sriparna Saha, Amit Sheth

2602.18429 2026-02-20
TESTING

CapNav: Benchmarking Vision Language Models on Capability-conditioned Indoor Navigation

Vision-Language Models (VLMs) have shown remarkable progress in Vision-Language Navigation (VLN), offering new possibilities for navigation decision-making that could benefit both robotic platforms...

Xia Su, Ruiqi Chen, Benlin Liu, Jingwei Ma, Zonglin Di, Ranjay Krishna, Jon Froehlich

2602.18424 2026-02-20
AI LLM

SPQ: An Ensemble Technique for Large Language Model Compression

This study presents an ensemble technique, SPQ (SVD-Pruning-Quantization), for large language model (LLM) compression that combines variance-retained singular value decomposition (SVD), activation-...

Jiamin Yao, Eren Gultepe

2602.18420 2026-02-20
AI LLM

AI-Wrapped: Participatory, Privacy-Preserving Measurement of Longitudinal LLM Use In-the-Wild

Alignment research on large language models (LLMs) increasingly depends on understanding how these systems are used in everyday contexts, yet naturalistic interaction data is difficult to access du...

Cathy Mengying Fang, Sheer Karny, Chayapatr Archiwaranguprok, Yasith Samaradivakara, Pat Pataranu...

2602.18415 2026-02-20
TESTING

An algebraic theory of Lojasiewicz exponents

We develop a unified algebraic and valuative theory of Lojasiewicz exponents for pairs of graded families and filtrations of ideals. Within this framework, local Lojasiewicz exponents, gradient exp...

Tai Huy Ha

2602.18410 2026-02-20
AI LLM

How Fast Can I Run My VLA? Demystifying VLA Inference Performance with VLA-Perf

Vision-Language-Action (VLA) models have recently demonstrated impressive capabilities across various embodied AI tasks. While deploying VLA models on real-world robots imposes strict real-time inf...

Wenqi Jiang, Jason Clemons, Karu Sankaralingam, Christos Kozyrakis

2602.18397 2026-02-20