Papers
Research papers from arXiv and related sources
Fine-tuning RoBERTa for CVE-to-CWE Classification: A 125M Parameter Model Competitive with LLMs
We present a fine-tuned RoBERTa-base classifier (125M parameters) for mapping Common Vulnerabilities and Exposures (CVE) descriptions to Common Weakness Enumeration (CWE) categories. We construct a...
Nikita Mosievskiy
Machine learning for sustainable geoenergy: uncertainty, physics and decision-ready inference
Geoenergy projects (CO2 storage, geothermal, subsurface H2 generation/storage, critical minerals from subsurface fluids, or nuclear waste disposal) increasingly follow a petroleum-style funnel from...
Hannah P. Menke, Ahmed H. Elsheikh, Lingli Wei, Nanzhe Wang, Andreas Busch
T-DAQ-P: a portable tablet-form multi-stream data acquisition and contextual telemetry platform based on COTS modules and a custom integration layer
We present T-DAQ-P, a compact and portable data acquisition and telemetry platform designed to support detector deployments in laboratory and field conditions by integrating event streaming, slow-c...
D. Tagnani, M. Andreotti
BiTro: Bidirectional Transfer Learning Enhances Bulk and Spatial Transcriptomics Prediction in Cancer Pathological Images
Cancer pathological analysis requires modeling tumor heterogeneity across multiple modalities, primarily through transcriptomics and whole slide imaging (WSI), along with their spatial relations. O...
Jingkun Yu, Guangkai Shang, Changtao Li, Xun Gong, Tianrui Li, Yazhou He, Zhipeng Luo
LLMs as Signal Detectors: Sensitivity, Bias, and the Temperature-Criterion Analogy
Large language models (LLMs) are evaluated for calibration using metrics such as Expected Calibration Error that conflate two distinct components: the model's ability to discriminate correct from i...
Jon-Paul Cacioli
Demonstration of AI-Assisted Scientific Workflow on Canonical Benchmarks
We present a fully reproducible demonstration of an AI-assisted scientific workflow designed for a broad physics, mathematics, and computer-science readership. The initial project artifact stack wa...
Kin Hung Fung
LLMind: Bio-inspired Training-free Adaptive Visual Representations for Vision-Language Models
Vision-Language Models (VLMs) typically assume a uniform spatial fidelity across the entire field of view of visual inputs, dedicating equal precision to even the uninformative regions. By contrast...
Soumyaratna Debnath, Bui Duc Manh, Zinan Liu, Lin Wang
Improved Degree Bounds for Hyperbolicity of Surfaces and Curve Complements
This paper establishes new degree bounds for Kobayashi hyperbolicity in dimension two. Our main results are: - A very generic surface in $\mathbb{P}^3$ of degree at least $17$ is Kobayashi hyperb...
Lei Hou, Dinh Tuan Huynh, Joël Merker, Song-Yan Xie
IgPose: A Generative Data-Augmented Pipeline for Robust Immunoglobulin-Antigen Binding Prediction
Predicting immunoglobulin-antigen (Ig-Ag) binding remains a significant challenge due to the paucity of experimentally-resolved complexes and the limited accuracy of de novo Ig structure prediction...
Tien-Cuong Bui, Injae Chung, Wonjun Lee, Junsu Ko, Juyong Lee
Video Detector: A Dual-Phase Vision-Based System for Real-Time Traffic Intersection Control and Intelligent Transportation Analysis
Urban traffic management increasingly requires intelligent sensing systems capable of adapting to dynamic traffic conditions without costly infrastructure modifications. Vision-based vehicle detect...
Mustafa Fatih Şen, Halûk Gümüşkaya, Şenol Pazar
vPET-ABC: Fast Voxelwise Approximate Bayesian Inference for Kinetic Modeling in PET
Dynamic PET kinetic modeling increasingly demands voxelwise uncertainty quantification and robust model selection. Yet total-body PET (TB-PET) data volumes make conventional Bayesian approaches, su...
Qinlin Gu, Gaelle M. Emvalomenos, Evan D. Morris, Clara Grazian, Steven R. Meikle
PCodeTrans: Translate Decompiled Pseudocode to Compilable and Executable Equivalent
Decompilation is foundational to binary analysis, yet conventional tools prioritize human readability over strict recompilability and verifiable runtime correctness. While recent LLM-based approach...
Yuxin Cui, Zeyu Gao, Shuxian He, Siliang Qin, Chao Zhang
Counterexample Guided Branching via Directional Relaxation Analysis in Complete Neural Network Verification
Deep Neural Networks demonstrate exceptional performance but remain vulnerable to adversarial perturbations, necessitating formal verification for safety-critical deployment. To address the computa...
Jingyang Li, Fu Song, Guoqiang Li
SimCert: Probabilistic Certification for Behavioral Similarity in Deep Neural Network Compression
Deploying Deep Neural Networks (DNNs) on resource-constrained embedded systems requires aggressive model compression techniques like quantization and pruning. However, ensuring that the compressed ...
Jingyang Li, Fu Song, Guoqiang Li
Universe Routing: Why Self-Evolving Agents Need Epistemic Control
A critical failure mode of current lifelong agents is not lack of knowledge, but the inability to decide how to reason. When an agent encounters "Is this coin fair?" it must recognize whether to in...
Zhaohui Geoffrey Wang
Face-to-Face: A Video Dataset for Multi-Person Interaction Modeling
Modeling the reactive tempo of human conversation remains difficult because most audio-visual datasets portray isolated speakers delivering short monologues. We introduce Face-to-Face with ...
Ernie Chu, Vishal M. Patel
CORAL: COntextual Reasoning And Local Planning in A Hierarchical VLM Framework for Underwater Monitoring
Oyster reefs are critical ecosystem species that sustain biodiversity, filter water, and protect coastlines, yet they continue to decline globally. Restoring these ecosystems requires regular under...
Zhenqi Wu, Yuanjie Lu, Xuesu Xiao, Xiaomin Lin
$p^2$RAG: Privacy-Preserving RAG Service Supporting Arbitrary Top-$k$ Retrieval
Retrieval-Augmented Generation (RAG) enables large language models to use external knowledge, but outsourcing the RAG service raises privacy concerns for both data owners and users. Privacy-preserv...
Yulong Ming, Mingyue Wang, Jijia Yang, Cong Wang, Xiaohua Jia
Investigating the Impact of Speech Enhancement on Audio Deepfake Detection in Noisy Environments
Logical Access (LA) attacks, also known as audio deepfake attacks, use Text-to-Speech (TTS) or Voice Conversion (VC) methods to generate spoofed speech data. This can represent a serious threat to ...
Angela Anacin, Shruti Kshirsagar, Anderson R. Avila
Online Learning for Supervisory Switching Control
We study supervisory switching control for partially-observed linear dynamical systems. The objective is to identify and deploy the best controller for the unknown system by periodically selecting ...
Haoyuan Sun, Ali Jadbabaie