Papers
Research papers from arXiv and related sources
MLLM-based Textual Explanations for Face Comparison
Multimodal Large Language Models (MLLMs) have recently been proposed as a means to generate natural-language explanations for face recognition decisions. While such explanations facilitate human in...
Redwan Sony, Anil K Jain, Arun Ross
Accelerating the Particle-In-Cell code ECsim with OpenACC
The Particle-In-Cell (PIC) method is a computational technique widely used in plasma physics to model plasmas at the kinetic level. In this work, we present our effort to prepare the semi-implicit ...
Elisabetta Boella, Nitin Shukla, Filippo Spiga, Mozhgan Kabiri Chimeh, Matt Bettencourt, Maria El...
Bio-inspired metaheuristic optimization for hierarchical architecture design of industrial control systems
Automated process control systems (APCS) are widely used in modern industrial enterprises. They address three key objectives: ensuring the required quality of manufactured products, ensuring proces...
Ruslan Zakirzyanov
CoEmpaTeam: Enhancing Cognitive Empathy using LLM-based Avatars and Dynamic Role Play in Virtual Reality
Cognitive empathy, the ability to understand others' perspectives, is essential for effective communication, reducing biases, and constructive negotiation. However, this skill is declining in a per...
Dehui Kong, Martin Feick, Shi Liu, Alexander Maedche
Retrieval-Augmented Sketch-Guided 3D Building Generation
In the early design stage of Japanese detached houses, the lack of a unified design representation among clients, sales representatives, and designers leads to design drift and inefficient feedback...
Zhengyang Wang, Nuttapong Rochanavibhata, Yuxiao Ren, Xusheng Du, Ye Zhang, Haoran Xie
Testing general relativity with binary black holes: a study on the sensitivity requirements for future space-based detectors
We study the sensitivity required for a future space-based detector to search for beyond-general-relativity effects in gravitational wave detection. To do this, we use the current design of TianQin,...
Tangchao Zhan, Changfu Shi, Shuo Sun, Jianwei Mei
CryoCMOS RF multiplexer for superconducting qubit control, readout and flux biasing at millikelvin temperatures with picowatt power consumption
Large-scale cryogenic quantum systems are constrained by an input-output bottleneck between room-temperature electronics and millikelvin stages, particularly in superconducting qubit platforms. Thi...
Liam Fallik, Sriram Balamurali, Alican Caglar, Rohith Acharya, Jacques Van Damme, Tsvetan Ivanov,...
Omnilingual SONAR: Cross-Lingual and Cross-Modal Sentence Embeddings Bridging Massively Multilingual Text and Speech
Cross-lingual sentence encoders typically cover only a few hundred languages and often trade downstream quality for stronger alignment, limiting their adoption. We introduce OmniSONAR, a new family...
Omnilingual SONAR Team, João Maria Janeiro, Pere-Lluís Huguet Cabot, Ioannis Tsiamas, Yen Meng, ...
Rationale Matters: Learning Transferable Rubrics via Proxy-Guided Critique for VLM Reward Models
Generative reward models (GRMs) for vision-language models (VLMs) often evaluate outputs via a three-stage pipeline: rubric generation, criterion-based scoring, and a final verdict. However, the in...
Weijie Qiu, Dai Guan, Junxin Wang, Zhihang Li, Yongbo Gai, Mengyu Zhou, Erchao Zhao, Xiaoxi Jiang...
On the Transfer of Collinearity to Computer Vision
Collinearity is a visual perception phenomenon in the human brain that amplifies spatially aligned edges arranged along a straight line. However, it is vague for which purpose humans might have thi...
Frederik Beuth, Danny Kowerko
BATQuant: Outlier-resilient MXFP4 Quantization via Learnable Block-wise Optimization
Microscaling floating-point (MXFP) formats have emerged as a promising standard for deploying Multi-modal Large Language Models (MLLMs) and Large Language Models (LLMs) on modern accelerator archit...
Ji-Fu Li, Manyi Zhang, Xiaobo Xia, Han Bao, Haoli Bai, Zhenhua Dong, Xianzhi Yu
HistoAtlas: A Pan-Cancer Morphology Atlas Linking Histomics to Molecular Programs and Clinical Outcomes
We present HistoAtlas, a pan-cancer computational atlas that extracts 38 interpretable histomic features from 6,745 diagnostic H&E slides across 21 TCGA cancer types and systematically links every ...
Pierre-Antoine Bannier
Runtime Governance for AI Agents: Policies on Paths
AI agents -- systems that plan, reason, and act using large language models -- produce non-deterministic, path-dependent behavior that cannot be fully governed at design time, where with governed w...
Maurits Kaptein, Vassilis-Javed Khan, Andriy Podstavnychy
When and Why Does Unsupervised RL Succeed in Mathematical Reasoning? A Manifold Envelopment Perspective
Although outcome-based reinforcement learning (RL) significantly advances the mathematical reasoning capabilities of Large Language Models (LLMs), its reliance on computationally expensive ground-t...
Zelin Zhang, Fei Cheng, Chenhui Chu
REFORGE: Multi-modal Attacks Reveal Vulnerable Concept Unlearning in Image Generation Models
Recent progress in image generation models (IGMs) enables high-fidelity content creation but also amplifies risks, including the reproduction of copyrighted content and the generation of offensive ...
Yong Zou, Haoran Li, Fanxiao Li, Shenyang Wei, Yunyun Dong, Li Tang, Wei Zhou, Renyang Liu
Malicious Or Not: Adding Repository Context to Agent Skill Classification
Agent skills extend local AI agents, such as Claude Code or Open Claw, with additional functionality, and their popularity has led to the emergence of dedicated skill marketplaces, similar to app s...
Florian Holzbauer, David Schmidt, Gabriel Gegenhuber, Sebastian Schrittwieser, Johanna Ullrich
Splitting horizontal and vertical polynomial order in a compatible finite element discretisation for numerical weather prediction
The accurate and efficient representation of atmospheric dynamics remains a central challenge in numerical weather prediction. A particular difficulty arises from the strong anisotropy of the atmos...
Daniel Witt, Thomas Bendall, Jemma Shipton
Characterizing Delusional Spirals through Human-LLM Chat Logs
As large language models (LLMs) have proliferated, disturbing anecdotal reports of negative psychological effects, such as delusions, self-harm, and "AI psychosis," have emerged in global media a...
Jared Moore, Ashish Mehta, William Agnew, Jacy Reese Anthis, Ryan Louie, Yifan Mai, Peggy Yin, My...
VideoMatGen: PBR Materials through Joint Generative Modeling
We present a method for generating physically-based materials for 3D shapes based on a video diffusion transformer architecture. Our method is conditioned on input geometry and a text description, ...
Jon Hasselgren, Zheng Zeng, Milos Hasan, Jacob Munkberg
Deep Learning-Driven Black-Box Doherty Power Amplifier with Pixelated Output Combiner and Extended Efficiency Range
This article presents a deep learning-driven inverse design methodology for Doherty power amplifiers (PA) with multi-port pixelated output combiner networks. A deep convolutional neural network (CN...
Han Zhou, Haojie Chang, David Widen