Personal Assistant Web

AI LLM

Predicting Conflict Impact on Performance in O-RAN

The O-RAN Alliance promotes the integration of intelligent autonomous agents to control the Radio Access Network (RAN). This improves flexibility, performance, and observability in the RAN, but int...

Pietro Brach del Prever, Niloofar Mohamadi, Salvatore D'Oro, Leonardo Bonati, Michele Polese, Łuk...

2603.08685 • 2026-03-09

AI LLM

A New Lower Bound for the Random Offerer Mechanism in Bilateral Trade using AI-Guided Evolutionary Search

The celebrated Myerson–Satterthwaite theorem shows that in bilateral trade, no mechanism can be simultaneously fully efficient, Bayesian incentive compatible (BIC), and budget balanced (BB). This ...

Yang Cai, Vineet Gupta, Zun Li, Aranyak Mehta

2603.08679 • 2026-03-09

TESTING

Exp-Force: Experience-Conditioned Pre-Grasp Force Selection with Vision-Language Models

Accurate pre-contact grasp force selection is critical for safe and reliable robotic manipulation. Adaptive controllers regulate force after contact but still require a reasonable initial estimate....

Siqi Shang, Minchao Huang, Bill Fan, Lillian Chin

2603.08668 • 2026-03-09

AI LLM

Cybersecurity AI: Hacking Consumer Robots in the AI Era

Is robot cybersecurity broken by AI? Consumer robots -- from autonomous lawnmowers to powered exoskeletons and window cleaners -- are rapidly entering homes and workplaces, yet their security remai...

Víctor Mayoral-Vilches, Unai Ayucar-Carbajo, Olivier Laflamme, Ruikai Peng, María Sanz-Gómez, Fra...

2603.08665 • 2026-03-09

AI LLM

How Far Can Unsupervised RLVR Scale LLM Training?

Unsupervised reinforcement learning with verifiable rewards (URLVR) offers a pathway to scale LLM training beyond the supervision bottleneck by deriving rewards without ground truth labels. Recent ...

Bingxiang He, Yuxin Zuo, Zeyuan Liu, Shangziqi Zhao, Zixuan Fu, Junlin Yang, Cheng Qian, Kaiyan Z...

2603.08660 • 2026-03-09

TESTING

Context-free Self-Conditioned GAN for Trajectory Forecasting

In this paper, we present a context-free unsupervised approach based on a self-conditioned GAN to learn different modes from 2D trajectories. Our intuition is that each mode indicates a different b...

Tiago Rodrigues de Almeida, Eduardo Gutierrez Maestro, Oscar Martinez Mozos

2603.08658 • 2026-03-09

AI LLM

OfficeQA Pro: An Enterprise Benchmark for End-to-End Grounded Reasoning

We introduce OfficeQA Pro, a benchmark for evaluating AI agents on grounded, multi-document reasoning over a large and heterogeneous document corpus. The corpus consists of U.S. Treasury Bulletins ...

Krista Opsahl-Ong, Arnav Singhvi, Jasmine Collins, Ivan Zhou, Cindy Wang, Ashutosh Baheti, Owen O...

2603.08655 • 2026-03-09

AI LLM

CoCo: Code as CoT for Text-to-Image Preview and Rare Concept Generation

Recent advancements in Unified Multimodal Models (UMMs) have significantly advanced text-to-image (T2I) generation, particularly through the integration of Chain-of-Thought (CoT) reasoning. However...

Haodong Li, Chunmei Qing, Huanyu Zhang, Dongzhi Jiang, Yihang Zou, Hongbo Peng, Dingming Li, Yuho...

2603.08652 • 2026-03-09

TESTING

Divide and Predict: An Architecture for Input Space Partitioning and Enhanced Accuracy

In this article the authors develop an intrinsic measure for quantifying heterogeneity in training data for supervised learning. This measure is the variance of a random variable which factors thro...

Fenix W. Huang, Henning S. Mortveit, Christian M. Reidys

2603.08649 • 2026-03-09

AI LLM

PostTrainBench: Can LLM Agents Automate LLM Post-Training?

AI agents have become surprisingly proficient at software engineering over the past year, largely due to improvements in reasoning capabilities. This raises a deeper question: can these systems ext...

Ben Rank, Hardik Bhatnagar, Ameya Prabhu, Shira Eisenberg, Karina Nguyen, Matthias Bethge, Maksym...

2603.08640 • 2026-03-09

TESTING

UNBOX: Unveiling Black-box visual models with Natural-language

Ensuring trustworthiness in open-world visual recognition requires models that are interpretable, fair, and robust to distribution shifts. Yet modern vision systems are increasingly deployed as pro...

Simone Carnemolla, Chiara Russo, Simone Palazzo, Quentin Bouniot, Daniela Giordano, Zeynep Akata,...

2603.08639 • 2026-03-09

AI LLM

Reachability-based Temporal Logic Verification for Reliable LLM-guided Human-Autonomy Teaming

We propose a reachability-based framework for reliable LLM-guided human-autonomy teaming (HAT) using signal temporal logic (STL). In the proposed framework, LLM is leveraged as a translator that tr...

Joonwon Choi, Kartik Anand Pant, Karthik Nune, Inseok Hwang

2603.08633 • 2026-03-09

TESTING

Secondary gravitational waves against a strong gravitational wave in the Bianchi VI universe

A proper-time method for constructing models of dynamic gravitational-wave fields is presented. Using the proper-time method, analytical (not numerical) models of secondary gravitational waves are ...

Konstantin E. Osetrin

2603.08628 • 2026-03-09

AI LLM

Coverage-Guided Multi-Agent Harness Generation for Java Library Fuzzing

Coverage-guided fuzzing has proven effective for software testing, but targeting library code requires specialized fuzz harnesses that translate fuzzer-generated inputs into valid API invocations. ...

Nils Loose, Nico Winkel, Kristoffer Hempel, Felix Mächtle, Julian Hans, Thomas Eisenbarth

2603.08616 • 2026-03-09

TESTING

Query-Guided Analysis and Mitigation of Data Verification Errors (Extended Version)

Data verification, the process of labeling data items as correct or incorrect, is a preprocessing step that may critically affect the quality of results in data-driven pipelines. Despite recent adv...

Ran Schreiber, Yael Amsterdamer

2603.08612 • 2026-03-09

TESTING

RESAPLE: An Approximate One-Step Restricted Likelihood Estimator of Spatial Dependence for Exploratory Spatial Analysis

Diagnostics such as Moran's index and approximate profile likelihood-based estimators (APLE) for Gaussian spatial autoregressive models are widely used in exploratory data analysis to assess the st...

Aditya Khan, Meredith Franklin

2603.08607 • 2026-03-09

AI LLM

What to Make Sense of in the Era of LLM? A Perspective from the Structure and Efforts in Sensemaking

Sensemaking tasks often entail navigating through complex, ambiguous data to construct coherent insights. Prior work has shown that crowds can effectively distribute cognitive load, pooling diverse...

Tianyi Li, Satya Samhita Bonepalli, Vikram Mohanty

2603.08604 • 2026-03-09

TESTING

Bilevel Planning with Learned Symbolic Abstractions from Interaction Data

Intelligent agents must reason over both continuous dynamics and discrete representations to generate effective plans in complex environments. Previous studies have shown that symbolic abstractions...

Fatih Dogangun, Burcu Kilic, Serdar Bahar, Emre Ugur

2603.08599 • 2026-03-09

TESTING

The Grasshopper Problem on the Sphere

The spherical grasshopper problem is a geometric optimization problem that arises in the context of Bell inequalities and can be interpreted as identifying the best local hidden variable approximat...

David Llamas, Dmitry Chistikov, Adrian Kent, Mike Paterson, Olga Goulko

2603.08579 • 2026-03-09

TESTING

Drift-to-Action Controllers: Budgeted Interventions with Online Risk Certificates

Deployed machine learning systems face distribution drift, yet most monitoring pipelines stop at alarms and leave the response underspecified under labeling, compute, and latency constraints. We in...

Ismail Lamaakal, Chaymae Yahyati, Khalid El Makkaoui, Ibrahim Ouahbi, Yassine Maleh

2603.08578 • 2026-03-09
