Climate in a Bottle: Towards a Generative Foundation Model for the Kilometer-Scale Global Atmosphere
(By Noah D. Brenowitz, Tao Ge, Akshay Subramaniam, Peter Manshausen, Aayush Gupta, David M. Hall, Morteza Mardani, Arash Vahdat, Karthik Kashinath, Michael S. Pritchard, NVIDIA)

* The paper introduces **Climate in a Bottle (cBottle)**, a generative diffusion-based AI framework capable of synthesizing full global atmospheric states at an unprecedented **5 km resolution** (over 12.5 million pixels per sample). Unlike prevailing auto-regressive paradigms, cBottle samples directly from the full distribution of atmospheric states without requiring a previous time step, thereby avoiding the drifts and instabilities inherent to time-stepping models.
* cBottle utilizes a **two-stage cascaded diffusion approach**: a global coarse-resolution generator conditioned on minimal climate-controlling inputs (such as monthly sea surface temperature and solar position), followed by a patch-based 16x super-resolution module.
* The model demonstrates **foundational versatility** by being trained jointly on multiple data modalities, including ERA5 reanalysis and ICON global cloud-resolving simulations. This enables various zero-shot applications such as climate downscaling, channel infilling for missing or corrupted variables, bias correction between datasets, and translation between these modalities.
* cBottle proposes a new form of **interactive climate modeling** through the use of guided diffusion. By training a classifier alongside the generator, users can steer the model to conditionally generate physically plausible **extreme weather events, such as Tropical Cyclones**, at specified locations on demand, circumventing the need to sift through petabytes of output to find rare events.
* The model exhibits **high climate faithfulness** across a battery of tests, including reproducing diurnal-to-seasonal scale variability, large-scale modes of variability (like the Northern Annular Mode), and tropical cyclone statistics. Furthermore, it achieves **extreme distillation** by encapsulating massive datasets into a few GB of neural network weights, offering a 256x compression ratio per channel.
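
To make the pixel counts behind the two-stage cascade concrete, here is a small arithmetic sketch. It assumes a HEALPix-style equal-area grid, which has `12 * nside**2` pixels; the summary above does not name the grid, so the grid choice and the `nside` values are illustrative assumptions chosen to be consistent with the quoted "over 12.5 million pixels" and the 16x super-resolution factor.

```python
# Assumption: a HEALPix-style equal-area grid with 12 * nside**2 pixels.
# The grid type and nside values are illustrative, not taken from the summary.

def healpix_npix(nside: int) -> int:
    """Number of pixels on a HEALPix grid with the given nside."""
    return 12 * nside * nside

FINE_NSIDE = 1024                        # hypothetical fine-grid resolution parameter
SR_FACTOR = 16                           # per-side super-resolution factor from the text
COARSE_NSIDE = FINE_NSIDE // SR_FACTOR   # resolution the coarse generator would work at

fine_pixels = healpix_npix(FINE_NSIDE)
coarse_pixels = healpix_npix(COARSE_NSIDE)
pixel_ratio = fine_pixels // coarse_pixels  # a 16x per-side factor is 16**2 in pixel count

print(f"coarse grid: nside={COARSE_NSIDE}, {coarse_pixels:,} pixels")
print(f"fine grid:   nside={FINE_NSIDE}, {fine_pixels:,} pixels")
print(f"pixels generated per coarse pixel: {pixel_ratio}")
```

Under these assumptions the coarse stage only has to model about 49 thousand pixels per channel globally, while the patch-based second stage fills in roughly 12.6 million.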
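
The guided-diffusion idea in the list above can be illustrated with a toy example. The sketch below is not the paper's procedure: it uses a one-dimensional standard normal as a stand-in for the generator's distribution, a hypothetical sigmoid "classifier" that fires for large `x` as a stand-in for an event detector, and plain Langevin sampling at a single noise level instead of a full diffusion schedule. What it does show is the core of classifier guidance: adding a scaled gradient of the classifier's log-probability to the model's score steers samples toward the requested event.

```python
import numpy as np

def prior_score(x):
    # Score (gradient of the log density) of a standard normal "generator".
    return -x

def classifier_log_grad(x, center=2.0, sharpness=2.0):
    # Gradient of log sigmoid(sharpness * (x - center)) with respect to x.
    # Uses 1 - sigmoid(z) = 0.5 * (1 - tanh(z / 2)) for numerical stability.
    z = sharpness * (x - center)
    return sharpness * 0.5 * (1.0 - np.tanh(z / 2.0))

def langevin_sample(guidance_scale, n_samples=5000, n_steps=500, step=0.01, seed=0):
    # Unadjusted Langevin dynamics driven by the guided score:
    #   score(x) = grad log p(x) + guidance_scale * grad log p(event | x)
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n_samples)
    for _ in range(n_steps):
        score = prior_score(x) + guidance_scale * classifier_log_grad(x)
        noise = rng.standard_normal(n_samples)
        x = x + step * score + np.sqrt(2.0 * step) * noise
    return x

unguided = langevin_sample(guidance_scale=0.0)
guided = langevin_sample(guidance_scale=1.0)

print(f"unguided mean: {unguided.mean():.2f}")
print(f"guided mean:   {guided.mean():.2f}")
```

With guidance switched off the samples stay centered near zero; with guidance on they shift toward the region the toy classifier labels as an event, while the prior term keeps them from drifting arbitrarily far.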