AI LLM March 06, 2026

Evaluating the Predictability of Selected Weather Extremes with Aurora, an AI Weather Forecast Model

Authors

Qin Huang, Moyan Liu, Yeongbin Kwon, Upmanu Lall

Abstract

AI weather foundation models now achieve forecast skill comparable to numerical weather prediction at far lower computational cost, yet their predictability for high-impact extremes across dynamical regimes remains uncertain. We evaluate Aurora using an event-based framework spanning tropical cyclones, freezes, heatwaves, atmospheric rivers, and extreme precipitation at lead times from 1 to 21 days. Aurora demonstrates strong short-range (1-7 day) skill across event types, including competitive tropical cyclone track accuracy and high spatial agreement for temperature and moisture extremes. However, a consistent subseasonal failure mode emerges: while large-scale circulation patterns remain moderately skillful at 14-21 day leads, threshold-based extreme intensity collapses as fields regress toward climatology. This divergence indicates that Aurora retains synoptic-scale dynamical structure but loses surface-impact amplitude beyond 7-10 days. The practical predictability horizon for deterministic AI extreme-event forecasting therefore remains constrained by intrinsic atmospheric dynamics.
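The divergence described in the abstract, where pattern skill survives while threshold-based extremes disappear, can be illustrated with a toy calculation. The sketch below is not the paper's evaluation code and does not use Aurora. It assumes a synthetic anomaly field, an idealized "forecast" that is the observed field scaled toward climatology by a hypothetical `damping` factor, and two illustrative metrics (a centered pattern correlation and a hit rate for exceedances of the 95th percentile). All function and variable names are made up for the example.

```python
import numpy as np


def pattern_correlation(forecast_anom, observed_anom):
    """Centered spatial correlation between two anomaly fields."""
    f = forecast_anom - forecast_anom.mean()
    o = observed_anom - observed_anom.mean()
    return float((f * o).sum() / np.sqrt((f ** 2).sum() * (o ** 2).sum()))


def hit_rate(forecast_anom, observed_anom, threshold):
    """Fraction of observed threshold exceedances that the forecast also exceeds."""
    observed_event = observed_anom >= threshold
    forecast_event = forecast_anom >= threshold
    n_events = observed_event.sum()
    if n_events == 0:
        return float("nan")
    return float((observed_event & forecast_event).sum() / n_events)


def damped_forecast(observed_anom, damping):
    """Idealized forecast: the right pattern, with amplitude shrunk toward climatology.

    damping = 1.0 reproduces the observed anomalies; smaller values regress
    the field toward zero anomaly (assumption made for illustration only).
    """
    return damping * observed_anom


# Synthetic standardized anomalies on a hypothetical 90 x 180 grid.
rng = np.random.default_rng(0)
observed = rng.standard_normal((90, 180))

# Extreme event defined as exceeding the 95th percentile of the observed field.
threshold = np.quantile(observed, 0.95)

for damping in (1.0, 0.7, 0.4):
    forecast = damped_forecast(observed, damping)
    corr = pattern_correlation(forecast, observed)
    hits = hit_rate(forecast, observed, threshold)
    print(f"damping={damping:.1f}  pattern_corr={corr:.3f}  hit_rate={hits:.3f}")
```

In this idealized setup the pattern correlation is unchanged by damping, because correlation is insensitive to amplitude, while the hit rate falls as fewer grid points clear the fixed threshold. Real forecasts also lose pattern skill with lead time; the sketch isolates only the amplitude effect.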

Metadata

arXiv ID: 2603.06516
Provider: ARXIV
Primary Category: physics.ao-ph
Published: 2026-03-06
Fetched: 2026-03-09 06:05
