Personal Assistant Web

TESTING

Testing the cosmic distance-duality relation with localized fast radio bursts: a cosmological model-independent study

We test the Etherington cosmic distance-duality relation (CDDR), by comparing Type Ia supernova (SNIa) luminosity-distance information from the Pantheon+ compilation with an angular-diameter-distan...

Jéferson A. S. Fortunato, Surajit Kalita, Amanda Weltman

2602.16869 • 2026-02-18

On the Tightness of the Second-Order Cone Relaxation of the Optimal Power Flow with Angles Recovery in Meshed Networks

This letter investigates properties of the second-order cone relaxation of the optimal power flow (OPF) problem, with emphasis on relaxation tightness, nodal voltage angles recovery, and alternatin...

Ginevra Larroux, Matthieu Jacobs, Mario Paolone

2602.16866 • 2026-02-18

SimToolReal: An Object-Centric Policy for Zero-Shot Dexterous Tool Manipulation

The ability to manipulate tools significantly expands the set of tasks a robot can perform. Yet, tool manipulation represents a challenging class of dexterity, requiring grasping thin objects, in-h...

Kushal Kedia, Tyler Ga Wei Lum, Jeannette Bohg, C. Karen Liu

2602.16863 • 2026-02-18

Asteroidal activity amongst meteor datasets: Confirmed new "rock-comet" stream and search for a tidal disruption signature

Asteroid activity (e.g., thermo-mechanical breakdown, impacts, rotational shedding, tidal disruption, etc.) can inject meteoroids into near-Earth space and leave detectable signatures in orbit cata...

Patrick M. Shober

2602.16845 • 2026-02-18

Overseeing Agents Without Constant Oversight: Challenges and Opportunities

To enable human oversight, agentic AI systems often provide a trace of reasoning and action steps. Designing traces to have an informative, but not overwhelming, level of detail remains a critical ...

Madeleine Grunde-McLaughlin, Hussein Mozannar, Maya Murad, Jingya Chen, Saleema Amershi, Adam Fou...

2602.16844 • 2026-02-18

New Physics and Symmetry Tests with Polarized Photon Fusion and Dipole Moments

We discuss new-physics searches and symmetry tests with dipole moments, emphasizing the role of polarization observables. As a primary benchmark, we consider polarized photon fusion in the $e^+ e^-...

Fang Xu

2602.16834 • 2026-02-18

IndicJR: A Judge-Free Benchmark of Jailbreak Robustness in South Asian Languages

Safety alignment of large language models (LLMs) is mostly evaluated in English and contract-bound, leaving multilingual vulnerabilities understudied. We introduce Indic Jailbreak Robustnes...

Priyaranjan Pattnayak, Sanchari Chowdhuri

2602.16832 • 2026-02-18

Learning under noisy supervision is governed by a feedback-truth gap

When feedback is absorbed faster than task structure can be evaluated, the learner will favor feedback over truth. A two-timescale model shows this feedback-truth gap is inevitable whenever the two...

Elan Schonfeld, Elias Wisnia

2602.16829 • 2026-02-18

Formal Mechanistic Interpretability: Automated Circuit Discovery with Provable Guarantees

Automated circuit discovery is a central tool in mechanistic interpretability for identifying the internal components of neural networks responsible for specific behaviors. While prior methods ha...

Itamar Hadad, Guy Katz, Shahaf Bassan

2602.16823 • 2026-02-18

Hybrid-Gym: Training Coding Agents to Generalize Across Tasks

When assessing the quality of coding agents, predominant benchmarks focus on solving single issues on GitHub, such as SWE-Bench. In contrast, in real use, these agents solve more various and comple...

Yiqing Xie, Emmy Liu, Gaokai Zhang, Nachiket Kotalwar, Shubham Gandhi, Sathwik Acharya, Xingyao W...

2602.16819 • 2026-02-18
