Papers
Research papers from arXiv and related sources
GHOST: Fast Category-agnostic Hand-Object Interaction Reconstruction from RGB Videos using Gaussian Splatting
Understanding realistic hand-object interactions from monocular RGB videos is essential for AR/VR, robotics, and embodied AI. Existing methods rely on category-specific templates or heavy computati...
Ahmed Tawfik Aboukhadra, Marcel Rogge, Nadia Robertini, Abdalla Arafa, Jameel Malik, Ahmed Elhaye...
Progressive Training for Explainable Citation-Grounded Dialogue: Reducing Hallucination to Zero in English-Hindi LLMs
Knowledge-grounded dialogue systems aim to generate informative, contextually relevant responses by conditioning on external knowledge sources. However, most existing approaches focus exclusively o...
Vedant Pandya
Uniform a priori bounds and error analysis for the Adam stochastic gradient descent optimization method
The adaptive moment estimation (Adam) optimizer proposed by Kingma & Ba (2014) is presumably the most popular stochastic gradient descent (SGD) optimization method for the training of deep neural n...
Steffen Dereich, Thang Do, Arnulf Jentzen
Comparative Analysis of Large Language Models in Generating Telugu Responses for Maternal Health Queries
Large Language Models (LLMs) have been progressively exhibiting their capabilities in various areas of research. The performance of the LLMs in acute maternal healthcare area, predominantly in low ...
Anagani Bhanusree, Sai Divya Vissamsetty, K VenkataKrishna Rao, Rimjhim
Act While Thinking: Accelerating LLM Agents via Pattern-Aware Speculative Tool Execution
LLM-powered agents are emerging as a dominant paradigm for autonomous task solving. Unlike standard inference workloads, agents operate in a strictly serial "LLM-tool" loop, where the LLM must wait...
Yifan Sui, Han Zhao, Rui Ma, Zhiyuan He, Hao Wang, Jianxun Li, Yuqing Yang
Translating MRI to PET through Conditional Diffusion Models with Enhanced Pathology Awareness
Positron emission tomography (PET) is a widely recognized technique for diagnosing neurodegenerative diseases, offering critical functional insights. However, its high costs and radiation exposure ...
Yitong Li, Igor Yakushev, Dennis M. Hedderich, Christian Wachinger
From Accuracy to Readiness: Metrics and Benchmarks for Human-AI Decision-Making
Artificial intelligence (AI) systems are deployed as collaborators in human decision-making. Yet, evaluation practices focus primarily on model accuracy rather than whether human-AI teams are prepa...
Min Hun Lee
I Can't Believe It's Corrupt: Evaluating Corruption in Multi-Agent Governance Systems
Large language models are increasingly proposed as autonomous agents for high-stakes public workflows, yet we lack systematic evidence about whether they would follow institutional rules when grant...
Vedanta S P, Ponnurangam Kumaraguru
Quantitative Introspection in Language Models: Tracking Internal States Across Conversation
Tracking the internal states of large language models across conversations is important for safety, interpretability, and model welfare, yet current methods are limited. Linear probes and other whi...
Nicolas Martorell
PromptHub: Enhancing Multi-Prompt Visual In-Context Learning with Locality-Aware Fusion, Concentration and Alignment
Visual In-Context Learning (VICL) aims to complete vision tasks by imitating pixel demonstrations. Recent work pioneered prompt fusion that combines the advantages of various demonstrations, which ...
Tianci Luo, Jinpeng Wang, Shiyu Qin, Niu Lian, Yan Feng, Bin Chen, Chun Yuan, Shu-Tao Xia
Reasoning over mathematical objects: on-policy reward modeling and test time aggregation
The ability to precisely derive mathematical objects is a core requirement for downstream STEM applications, including mathematics, physics, and chemistry, where reasoning must culminate in formall...
Pranjal Aggarwal, Marjan Ghazvininejad, Seungone Kim, Ilia Kulikov, Jack Lanchantin, Xian Li, Tia...
Geography According to ChatGPT -- How Generative AI Represents and Reasons about Geography
Understanding how AI will represent and reason about geography should be a key concern for all of us, as the broader public increasingly interacts with spaces and places through these systems. Simi...
Krzysztof Janowicz, Gengchen Mai, Rui Zhu, Song Gao, Zhangyu Wang, Yingjie Hu, Lauren Bennett
A Human-in/on-the-Loop Framework for Accessible Text Generation
Plain Language and Easy-to-Read formats in text simplification are essential for cognitive accessibility. Yet current automatic simplification and evaluation pipelines remain largely automated, met...
Lourdes Moreno, Paloma Martínez
Bridging Crystal Structure and Material Properties via Bond-Centric Descriptors
Although chemical bonding is the fundamental mechanistic bridge connecting atomic structure to macroscopic material properties, current data-driven materials science largely treats it as an implici...
Jian-Feng Zhang, Ze-Feng Gao, Xiao-Qi Han, Bo Zhan, Dingshun Lv, Miao Gao, Kai Liu, Xinguo Ren, Z...
Probing the Color-Octet Mechanism via Dihadron Fragmentation in $χ_b$ Decays
The color-octet (CO) mechanism is a cornerstone of non-relativistic QCD, yet its long-distance matrix elements remain limited, preventing stringent tests of the theory. We demonstrate that the Artr...
Zhi-Guo He, Guanghui Li, Yu-Jie Tian, Xin-Kai Wen, Bin Yan
Evaluating LLM-Generated Lessons from the Language Learning Students' Perspective: A Short Case Study on Duolingo
Popular language learning applications such as Duolingo use large language models (LLMs) to generate lessons for their users. Most lessons focus on general real-world scenarios such as greetings, ord...
Carlos Rafael Catalan, Patricia Nicole Monderin, Lheane Marie Dizon, Gap Estrella, Raymund John S...
Bridging Network Fragmentation: A Semantic-Augmented DRL Framework for UAV-aided VANETs
Vehicular Ad-hoc Networks (VANETs) are the digital cornerstone of autonomous driving, yet they suffer from severe network fragmentation in urban environments due to physical obstructions. Unmanned ...
Gaoxiang Cao, Wenke Yuan, Huasen He, Yunpeng Hou, Xiaofeng Jiang, Shuangwu Chen, Jian Yang
Through the Looking-Glass: AI-Mediated Video Communication Reduces Interpersonal Trust and Confidence in Judgments
AI-based tools that mediate, enhance or generate parts of video communication may interfere with how people evaluate trustworthiness and credibility. In two preregistered online experiments (N = 2,...
Nelson Navajas Fernández, Jeffrey T. Hancock, Maurice Jakesch
Conflict-Based Search for Multi Agent Path Finding with Asynchronous Actions
Multi-Agent Path Finding (MAPF) seeks collision-free paths for multiple agents from their respective start locations to their respective goal locations while minimizing path costs. Most existing MA...
Xuemian Wu, Shizhe Zhao, Zhongqiang Ren
Matter radii from interaction cross sections using microscopic nuclear densities
Understanding how nuclear size evolves with the number of protons and neutrons tests our models of strongly interacting matter. The nuclear charge (and proton) radii accessible through electromagne...
A. J. Smith, K. Godbey, C. Hebborn, W. Nazarewicz, F. M. Nunes, P. -G. Reinhard