Paper
A practical randomized trust-region method to escape saddle points in high dimension
Authors
Radu-Alexandru Dragomir, Xiaowen Jiang, Bonan Sun, Nicolas Boumal
Abstract
Without randomization, escaping the saddle points of $f \colon \mathbb{R}^d \to \mathbb{R}$ requires at least $\Omega(d)$ pieces of information about $f$ (values, gradients, Hessian-vector products). With randomization, this can be reduced to a polylogarithmic dependence on $d$. The prototypical algorithm to that effect is perturbed gradient descent (PGD): through sustained jitter, it reliably escapes strict saddle points. However, it also never settles: there is no convergence. What is more, PGD requires precise tuning based on Lipschitz constants and a preset target accuracy.

To improve on this, we modify the time-tested trust-region method with truncated conjugate gradients (TR-tCG). Specifically, we randomize the initialization of tCG (the subproblem solver), and we prove that tCG automatically amplifies the randomization near saddles (to escape) and absorbs it near local minimizers (to converge). Saddle escape happens over several iterations; accordingly, our analysis is multi-step, with several novelties.

The proposed algorithm is practical: it essentially tracks the good behavior of TR-tCG, with three minor modifications and a single new hyperparameter (the noise scale $\sigma$). We provide an implementation and numerical experiments.
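To make the idea concrete, here is a minimal sketch of a truncated conjugate gradient (tCG) subproblem solver for the trust-region model $m(s) = g^\top s + \tfrac12 s^\top H s$ subject to $\|s\| \le \Delta$, with the starting point drawn at random instead of being set to zero. This is an illustration of the randomized-initialization idea only, not the paper's exact algorithm: the noise scaling `sigma * N(0, I)` and all stopping details here are assumptions.

```python
import numpy as np

def to_boundary(s, p, delta):
    """Follow direction p from s to the trust-region boundary ||s + tau*p|| = delta."""
    a = p @ p
    b = 2.0 * (s @ p)
    c = s @ s - delta**2
    tau = (-b + np.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)  # positive root
    return s + tau * p

def truncated_cg(H, g, delta, sigma=1e-3, max_iter=100, tol=1e-8, seed=None):
    """Truncated CG for min_s g^T s + 0.5 s^T H s, subject to ||s|| <= delta.

    Standard tCG starts at s = 0; following the randomized variant described
    in the abstract, we instead start at a small random point (the scaling
    used here is a hypothetical choice for illustration).
    """
    rng = np.random.default_rng(seed)
    s = sigma * rng.standard_normal(g.size)  # randomized initialization
    r = g + H @ s                            # gradient of the model at s
    p = -r
    for _ in range(max_iter):
        Hp = H @ p
        pHp = p @ Hp
        if pHp <= 0.0:                       # negative curvature: exit via the boundary
            return to_boundary(s, p, delta)
        alpha = (r @ r) / pHp
        s_next = s + alpha * p
        if np.linalg.norm(s_next) >= delta:  # step leaves the region: truncate
            return to_boundary(s, p, delta)
        r_next = r + alpha * Hp
        beta = (r_next @ r_next) / (r @ r)
        s, r = s_next, r_next
        if np.linalg.norm(r) < tol:          # interior solution found
            break
        p = -r + beta * p
    return s
```

Near a local minimizer ($H \succ 0$, small $g$) the CG recursion contracts the random start toward the interior Newton step, while near a strict saddle the negative-curvature branch pushes the iterate to the boundary, which matches the amplify/absorb behavior the abstract describes.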
Metadata
arXiv: 2603.15494 [math.OC] (cross-listed: math.NA)
Published: 2026-03-16
Comment: 52 pages + appendices (61 pages in total)