Papers
Research papers from arXiv and related sources
Seeing Farther and Smarter: Value-Guided Multi-Path Reflection for VLM Policy Optimization
Solving complex, long-horizon robotic manipulation tasks requires a deep understanding of physical interactions, reasoning about their long-term consequences, and precise high-level planning. Visio...
Yanting Yang, Shenyuan Gao, Qingwen Bu, Li Chen, Dimitris N. Metaxas
LLMs Can Learn to Reason Via Off-Policy RL
Reinforcement learning (RL) approaches for Large Language Models (LLMs) frequently use on-policy algorithms, such as PPO or GRPO. However, policy lag from distributed training architectures and dif...
Daniel Ritter, Owen Oertell, Bradley Guo, Jonathan Chang, Kianté Brantley, Wen Sun
MentalBlackboard: Evaluating Spatial Visualization via Mathematical Transformations
Spatial visualization is the mental ability to imagine, transform, and manipulate the spatial characteristics of objects and actions. This intelligence is a part of human cognition where actions an...
Nilay Yilmaz, Maitreya Patel, Naga Sai Abhiram Kusumba, Yixuan He, Yezhou Yang
PerSoMed: A Large-Scale Balanced Dataset for Persian Social Media Text Classification
This research introduces the first large-scale, well-balanced Persian social media text classification dataset, specifically designed to address the lack of comprehensive resources in this domain. ...
Isun Chehreh, Ebrahim Ansari
Partial Soft-Matching Distance for Neural Representational Comparison with Partial Unit Correspondence
Representational similarity metrics typically force all units to be matched, making them susceptible to noise and outliers common in neural representations. We extend the soft-matching distance to ...
Chaitanya Kapoor, Alex H. Williams, Meenakshi Khosla
RetinaVision: XAI-Driven Augmented Regulation for Precise Retinal Disease Classification using deep learning framework
Early and accurate classification of retinal diseases is critical for countering vision loss and guiding clinical management. In this study, we proposed a deep learning method f...
Mohammad Tahmid Noor, Shayan Abrar, Jannatul Adan Mahi, Md Parvez Mia, Asaduzzaman Hridoy, Samant...
IPv2: An Improved Image Purification Strategy for Real-World Ultra-Low-Dose Lung CT Denoising
The image purification strategy constructs an intermediate distribution with aligned anatomical structures, which effectively corrects the spatial misalignment between real-world ultra-low-dose CT ...
Guoliang Gong, Man Yu
Learning partial transpose signatures in qubit ququart states from a few measurements
Higher-dimensional quantum systems are attracting interest for improving quantum protocol performance by increasing memory space. Characterizing quantum resources of such systems is fundamental but...
Christian Candeago, Paolo Da Rold, Michele Grossi, Pawel Horodecki, Antonio Mandarino
The Path to Conversational AI Tutors: Integrating Tutoring Best Practices and Targeted Technologies to Produce Scalable AI Agents
The emergence of generative AI has accelerated the development of conversational tutoring systems that interact with students through natural language dialogue. Unlike prior intelligent tutoring sy...
Kirk Vanacore, Ryan S. Baker, Avery H. Closser, Jeremy Roschelle
Time-Varying Hazard Patterns and Co-Mutation Profiles of KRAS G12C and G12D in Real-World NSCLC
Background: KRAS mutations are the largest oncogenic subset in NSCLC. While KRAS G12C is now targetable, no approved therapies exist for G12D. We examined time-to-next-treatment (TTNT) and overall ...
Robert Amevor, Dennis Baidoo, Emmanuel Kubuafor
Towards Automated Page Object Generation for Web Testing using Large Language Models
Page Objects (POs) are a widely adopted design pattern for improving the maintainability and scalability of automated end-to-end web tests. However, creating and maintaining POs is still largely a ...
Betül Karagöz, Filippo Ricca, Matteo Biagiola, Andrea Stocco
AdsorbFlow: energy-conditioned flow matching enables fast and realistic adsorbate placement
Identifying low-energy adsorption geometries on catalytic surfaces is a practical bottleneck for computational heterogeneous catalysis: the difficulty lies not only in the cost of density functiona...
Jiangjie Qiu, Wentao Li, Honghao Chen, Leyi Zhao, Xiaonan Wang
Limited Reasoning Space: The cage of long-horizon reasoning in LLMs
Test-time compute strategies, such as Chain-of-Thought (CoT), have significantly enhanced the ability of large language models to solve complex tasks like logical reasoning. However, empirical stu...
Zhenyu Li, Guanlin Wu, Cheems Wang, Yongqiang Zhao
DD-CAM: Minimal Sufficient Explanations for Vision Models Using Delta Debugging
We introduce a gradient-free framework for identifying minimal, sufficient, and decision-preserving explanations in vision models by isolating the smallest subset of representational units whose jo...
Krishna Khadka, Yu Lei, Raghu N. Kacker, D. Richard Kuhn
VIRAASAT: Traversing Novel Paths for Indian Cultural Reasoning
Large Language Models (LLMs) have made significant progress in reasoning tasks across various domains such as mathematics and coding. However, their performance deteriorates in tasks requiring rich...
Harshul Raj Surana, Arijit Maji, Aryan Vats, Akash Ghosh, Sriparna Saha, Amit Sheth
CapNav: Benchmarking Vision Language Models on Capability-conditioned Indoor Navigation
Vision-Language Models (VLMs) have shown remarkable progress in Vision-Language Navigation (VLN), offering new possibilities for navigation decision-making that could benefit both robotic platforms...
Xia Su, Ruiqi Chen, Benlin Liu, Jingwei Ma, Zonglin Di, Ranjay Krishna, Jon Froehlich
SPQ: An Ensemble Technique for Large Language Model Compression
This study presents an ensemble technique, SPQ (SVD-Pruning-Quantization), for large language model (LLM) compression that combines variance-retained singular value decomposition (SVD), activation-...
Jiamin Yao, Eren Gultepe
AI-Wrapped: Participatory, Privacy-Preserving Measurement of Longitudinal LLM Use In-the-Wild
Alignment research on large language models (LLMs) increasingly depends on understanding how these systems are used in everyday contexts, yet naturalistic interaction data is difficult to access du...
Cathy Mengying Fang, Sheer Karny, Chayapatr Archiwaranguprok, Yasith Samaradivakara, Pat Pataranu...
An algebraic theory of Lojasiewicz exponents
We develop a unified algebraic and valuative theory of Lojasiewicz exponents for pairs of graded families and filtrations of ideals. Within this framework, local Lojasiewicz exponents, gradient exp...
Tai Huy Ha
How Fast Can I Run My VLA? Demystifying VLA Inference Performance with VLA-Perf
Vision-Language-Action (VLA) models have recently demonstrated impressive capabilities across various embodied AI tasks. While deploying VLA models on real-world robots imposes strict real-time inf...
Wenqi Jiang, Jason Clemons, Karu Sankaralingam, Christos Kozyrakis