Papers

Research papers from arXiv and related sources

Total: 4694 | AI/LLM: 2583 | Testing: 2111
AI LLM

Theory Discovery in Social Networks: Automating ERGM Specification with Large Language Models

Understanding how social networks form, whether through reciprocity, shared attributes, or triadic closure, is central to computational social science. Exponential Random Graph Models (ERGMs) offer...

Yidan Sun, Mayank Kejriwal

2603.04306 2026-03-04
TESTING

$V_1$: Unifying Generation and Self-Verification for Parallel Reasoners

Test-time scaling for complex reasoning tasks shows that leveraging inference-time compute, by methods such as independently sampling and aggregating multiple solutions, results in significantly be...

Harman Singh, Xiuyu Li, Kusha Sareen, Monishwaran Maheswaran, Sijun Tan, Xiaoxia Wu, Junxiong Wan...

2603.04304 2026-03-04
TESTING

Compliant In-hand Rolling Manipulation Using Tactile Sensing

We investigate in-hand rolling manipulation using a multifingered robot hand, where each finger is compliant and equipped with a tactile fingertip providing contact location and wrench information....

Huan Weng, Yifei Chen, Kevin M. Lynch

2603.04301 2026-03-04
AI LLM

The Company You Keep: How LLMs Respond to Dark Triad Traits

Large Language Models (LLMs) often exhibit highly agreeable and reinforcing conversational styles, also known as AI-sycophancy. Although this behavior is encouraged, it may become problematic when ...

Zeyi Lu, Angelica Henestrosa, Pavel Chizhov, Ivan P. Yamshchikov

2603.04299 2026-03-04
AI LLM

LabelBuddy: An Open Source Music and Audio Language Annotation Tagging Tool Using AI Assistance

The advancement of machine learning (ML), Large Audio Language Models (LALMs), and autonomous AI agents in Music Information Retrieval (MIR) necessitates a shift from static tagging to rich, human-...

Ioannis Prokopiou, Ioannis Sina, Agisilaos Kounelis, Pantelis Vikatos, Themos Stafylakis

2603.04293 2026-03-04
AI LLM

Position: Vector Prompt Interfaces Should Be Exposed to Enable Customization of Large Language Models

As large language models (LLMs) transition from research prototypes to real-world systems, customization has emerged as a central bottleneck. While text prompts can already customize LLM behavior, ...

Liangwei Yang, Shiyu Wang, Haolin Chen, Rithesh Murthy, Ming Zhu, Jielin Qiu, Zixiang Chen, Junta...

2603.04292 2026-03-04
TESTING

The effect of chemical vapor infiltration process parameters on flexural strength of porous α-SiC: A numerical model

The flexural strength variability of α-SiC based ceramics at elevated temperatures creates the need for an Integrated Computational Materials Engineering (ICME) framework that relates the strength ...

Joseph J. Marziale, Jason Sun, Eric A. Walker, Yu Chen, David Salac, James Chen

2603.04287 2026-03-04
AI LLM

VANGUARD: Vehicle-Anchored Ground Sample Distance Estimation for UAVs in GPS-Denied Environments

Autonomous aerial robots operating in GPS-denied or communication-degraded environments frequently lose access to camera metadata and telemetry, leaving onboard perception systems unable to recover...

Yifei Chen, Xupeng Chen, Feng Wang, Niangang Jiao, Jiayin Liu

2603.04277 2026-03-04
AI LLM

Causality Elicitation from Large Language Models

Large language models (LLMs) are trained on enormous amounts of data and encode knowledge in their parameters. We propose a pipeline to elicit causal relationships from LLMs. Specifically, (i) we s...

Takashi Kameyama, Masahiro Kato, Yasuko Hio, Yasushi Takano, Naoto Minakawa

2603.04276 2026-03-04
TESTING

Statistical Inference for Score Decompositions

We introduce inference methods for score decompositions, which partition scoring functions for predictive assessment into three interpretable components: miscalibration, discrimination, and uncerta...

Timo Dimitriadis, Marius Puke

2603.04275 2026-03-04
TESTING

Grid-agnostic volume of fluid approach with interface sharpening and surface tension for compressible multiphase flows

The interfacial diffusion associated with finite volume method (FVM) discretizations of multiphase flows creates the need for an interface sharpening mechanism. Such solutions for structured quadri...

J. Marziale, J. Sun, D. Salac, J. Chen

2603.04270 2026-03-04
AI LLM

ViterbiPlanNet: Injecting Procedural Knowledge via Differentiable Viterbi for Planning in Instructional Videos

Procedural planning aims to predict a sequence of actions that transforms an initial visual state into a desired goal, a fundamental ability for intelligent agents operating in complex environments...

Luigi Seminara, Davide Moltisanti, Antonino Furnari

2603.04265 2026-03-04
TESTING

Atmospheric neutrino constraints on Lorentz invariance violation with the first six detection units of KM3NeT/ORCA

Lorentz invariance is a fundamental symmetry underlying both the Standard Model of particle physics and General Relativity. Testing its validity provides a direct means of searching for new physics...

KM3NeT Collaboration, O. Adriani, A. Albert, A. R. Alhebsi, S. Alshalloudi, S. Alves Garre, F. A...

2603.04264 2026-03-04
AI LLM

When AI Fails, What Works? A Data-Driven Taxonomy of Real-World AI Risk Mitigation Strategies

Large language models (LLMs) are increasingly embedded in high-stakes workflows, where failures propagate beyond isolated model errors into systemic breakdowns that can lead to legal exposure, repu...

Evgenija Popchanovska, Ana Gjorgjevikj, Maryan Rizinski, Lubomir Chitkushev, Irena Vodenska, Dimi...

2603.04259 2026-03-04
AI LLM

Memex(RL): Scaling Long-Horizon LLM Agents via Indexed Experience Memory

Large language model (LLM) agents are fundamentally bottlenecked by finite context windows on long-horizon tasks. As trajectories grow, retaining tool outputs and intermediate reasoning in-context ...

Zhenting Wang, Huancheng Chen, Jiayun Wang, Wei Wei

2603.04257 2026-03-04
TESTING

Learning Read-Once Determinants and the Principal Minor Assignment Problem

A symbolic determinant under rank-one restriction computes a polynomial of the form $\det(A_0+A_1y_1+\ldots+A_ny_n)$, where $A_0,A_1,\ldots,A_n$ are square matrices over a field $\mathbb{F}$ and $r...

Abhiram Aravind, Abhranil Chatterjee, Sumanta Ghosh, Rohit Gurjar, Roshan Raj, Chandan Saha

2603.04255 2026-03-04
TESTING

Cluster-Level Experiments using Temporal Switchback Designs: Precision Gains in Pricing A/B Tests at LATAM Airlines

Experimentation is central to modern digital businesses, but many operational decisions cannot be randomized at the user level. In such cases, cluster-level experiments, where clusters are usually ...

Nicolás Ferrari-Ortiz, Sebastián Orellana-Montini, Timur Abbiasov, Marie Garkavenko, Rutger Lit

2603.04252 2026-03-04
TESTING

Predicting oscillations in complex networks with delayed feedback

Oscillatory dynamics are common features of complex networks, often playing essential roles in regulating function. Across scales from gene regulatory networks to ecosystems, delayed feedback mecha...

Shijie Liu, Jinliang Han, Tim Rogers, Yongzheng Sun

2603.04251 2026-03-04
AI LLM

LikeThis! Empowering App Users to Submit UI Improvement Suggestions Instead of Complaints

User feedback is crucial for the evolution of mobile apps. However, research suggests that users tend to submit uninformative, vague, or destructive feedback. Unlike recent AI4SE approaches that fo...

Jialiang Wei, Ali Ebrahimi Pourasad, Walid Maalej

2603.04245 2026-03-04
AI LLM

Agentics 2.0: Logical Transduction Algebra for Agentic Data Workflows

Agentic AI is rapidly transitioning from research prototypes to enterprise deployments, where requirements extend to meet the software quality attributes of reliability, scalability, and observabil...

Alfio Massimiliano Gliozzo, Junkyu Lee, Nahuel Defosse

2603.04241 2026-03-04