
Earthly Machine Learning

Amirpasha

Available episodes

5 of 38
  • Jigsaw: Training Multi-Billion-Parameter AI Weather Models With Optimized Model Parallelism
    Authors: Deifilia Kieckhefen, Markus Götz, Lars H. Heyen, Achim Streit, and Charlotte Debus (Karlsruhe Institute of Technology, Helmholtz AI)
    • The paper introduces WeatherMixer (WM), a multi-layer perceptron (MLP)-based architecture for atmospheric forecasting that serves as a competitive alternative to Transformer-based models. WM's workload scales linearly with input size, sidestepping the quadratic computational complexity of the self-attention mechanism in Transformers when dealing with gigabyte-sized atmospheric data.
    • A novel parallelization scheme called Jigsaw parallelism combines domain parallelism and tensor parallelism to train multi-billion-parameter models efficiently. Jigsaw is optimized for large input data: it fully shards the data, model parameters, and optimizer states across devices, eliminating memory redundancy. Because each device loads only its own partition of the data (domain parallelism), the scheme mitigates the I/O-bandwidth bottlenecks frequently encountered in training large scientific AI models and achieves superscalar weak scaling on I/O-bandwidth-limited systems, while exceeding state-of-the-art strong scaling on computation-communication-limited systems. Training was successfully scaled to 256 GPUs, reaching peak performances of 9 and 11 PFLOPS.
    • Beyond hardware efficiency, Jigsaw improves predictive performance: partitioning the model across more GPUs (model parallelism) instead of relying solely on data parallelism naturally enforces smaller global batch sizes, which empirically mitigates the problematic large-batch effects observed in AI weather models and leads to lower loss values.
    --------  
    13:36
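The domain-parallel half of Jigsaw can be illustrated with a toy sketch: each device reads only its own latitude band of the global grid (plus a small halo of neighbouring rows), so per-device I/O shrinks as more devices are added. The grid shape and halo width below are illustrative assumptions, not values from the paper.

```python
import numpy as np

def domain_shard(global_field: np.ndarray, rank: int, n_devices: int,
                 halo: int = 1) -> np.ndarray:
    """Return this rank's latitude band plus a halo of neighbouring rows."""
    n_lat = global_field.shape[0]
    band = n_lat // n_devices
    lo = max(rank * band - halo, 0)
    hi = min((rank + 1) * band + halo, n_lat)
    return global_field[lo:hi]

field = np.random.rand(720, 1440)           # e.g. a 0.25-degree lat/lon grid
shards = [domain_shard(field, r, 8) for r in range(8)]
print([s.shape for s in shards])            # each device holds ~1/8 of the rows
```

In the real scheme the model parameters and optimizer states are sharded as well; this sketch only shows why partitioned data loading cuts the per-device I/O volume.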
  • XiChen: An observation-scalable fully AI-driven global weather forecasting system with 4D variational knowledge
    Authors: Wuxin Wang, Weicheng Ni, Lilan Huang, Tao Han, Ben Fei, Shuo Ma, Taikang Yuan, Yanlai Zhao, Kefeng Deng, Xiaoyong Li, Boheng Duan, Lei Bai, Kaijun Ren
    • XiChen is the first observation-scalable, fully AI-driven global weather forecasting system. Its entire pipeline, from Data Assimilation (DA) to 10-day medium-range forecasting, completes within 17 seconds on a single A100 GPU, an acceleration of more than 400-fold over the computational time required by operational Numerical Weather Prediction (NWP) systems.
    • The system is built on a foundation model that is first pre-trained for weather forecasting and then fine-tuned to act as both observation operators and DA models. Crucially, the integration of four-dimensional variational (4DVar) knowledge ensures that XiChen's DA and medium-range forecasting accuracy rivals that of operational NWP systems.
    • XiChen demonstrates scalability and robustness by employing a cascaded sequential DA framework to assimilate both conventional observations (GDAS prepbufr) and raw satellite observations (AMSU-A and MHS). New observation types can be integrated in the future simply by fine-tuning the respective observation operators and DA model components, which is critical for operational deployment.
    • XiChen achieves a skillful forecasting lead time exceeding 8.25 days (ACC of Z500 > 0.6), comparable to the Global Forecast System (GFS) and substantially surpassing other end-to-end AI-based global weather forecasting systems such as Aardvark (less than 8 days) and GraphDOP (about 5 days).
    • A dual DA framework operationalizes XiChen as a continuous forecasting system: separate 12-hour and 3-hour Data Assimilation Windows (DAW) circumvent the multi-hour latency characteristic of high-resolution systems such as IFS HRES, enabling real-time delivery of medium-range forecast products.
    --------  
    16:08
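The 4DVar idea that XiChen distils into its learned components can be sketched in a few lines: find the initial state that balances a background (prior) estimate against observations spread across an assimilation window, by minimizing a quadratic cost. The linear toy model M, identity observation operator H, and unit covariances below are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
M = np.eye(n) + 0.05 * rng.standard_normal((n, n))   # toy one-step forecast model
H = np.eye(n)                                        # toy observation operator
B_inv = np.eye(n)                                    # background-error precision
R_inv = np.eye(n)                                    # observation-error precision

x_truth = rng.standard_normal(n)
x_b = x_truth + 0.5 * rng.standard_normal(n)         # background (prior) state
obs = [H @ np.linalg.matrix_power(M, k) @ x_truth for k in range(1, 4)]

def cost(x0):
    """4DVar cost: background misfit plus observation misfits over the window."""
    j = 0.5 * (x0 - x_b) @ B_inv @ (x0 - x_b)
    for k, y in enumerate(obs, start=1):
        d = H @ np.linalg.matrix_power(M, k) @ x0 - y
        j += 0.5 * d @ R_inv @ d
    return j

def grad(x0):
    g = B_inv @ (x0 - x_b)
    for k, y in enumerate(obs, start=1):
        Mk = np.linalg.matrix_power(M, k)
        g += Mk.T @ H.T @ R_inv @ (H @ Mk @ x0 - y)
    return g

x = x_b.copy()
for _ in range(200):                                 # plain gradient descent
    x -= 0.1 * grad(x)

print(cost(x_b), cost(x))                            # analysis beats background
```

Operational 4DVar replaces the explicit model powers with tangent-linear and adjoint models; XiChen instead bakes this knowledge into fine-tuned neural components.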
  • FuXi Weather: A data-to-forecast machine learning system for global weather
    Xiuyu Sun et al. (2025). A data-to-forecast machine learning system for global weather. Nature Communications. https://doi.org/10.1038/s41467-025-62024-1
    • FuXi Weather is introduced as an end-to-end machine learning system for global weather forecasting. It autonomously performs data assimilation and forecasting in a 6-hour cycle, directly processing raw multi-satellite observations, and is the first such system to demonstrate continuous cycling operation over a full one-year period.
    • The system exhibits superior forecast accuracy in observation-sparse regions, outperforming traditional high-resolution forecasts from the European Centre for Medium-Range Weather Forecasts (ECMWF HRES) beyond day one in areas such as central Africa and northern South America, despite using substantially fewer observations.
    • Globally, FuXi Weather delivers 10-day forecast performance comparable to ECMWF HRES, generating reliable forecasts at 0.25° resolution and extending the skillful lead times for several key meteorological variables.
    • FuXi Weather offers a cost-effective and physically consistent alternative to traditional Numerical Weather Prediction (NWP) systems. Its computational efficiency and reduced complexity are valuable for improving operational forecasts and enhancing climate resilience in regions with limited land-based observational infrastructure.
    • This development challenges the prevailing view that standalone machine learning-based weather forecasting systems are not viable for operational use, marking a significant step forward in the application of AI to real-world weather prediction.
    --------  
    13:52
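FuXi Weather's continuous 6-hour assimilate-then-forecast cycle can be sketched as a loop in which each analysis becomes the background for the next cycle. The `assimilate` and `forecast` functions below are trivial placeholders standing in for the paper's learned components, and the scalar "temperature" state is purely illustrative.

```python
from datetime import datetime, timedelta

def assimilate(background: float, observations: float) -> float:
    # Placeholder analysis: nudge the background toward the observations.
    return 0.8 * background + 0.2 * observations

def forecast(state: float, hours: int = 6) -> float:
    # Placeholder persistence model advancing the state one cycle.
    return state

state = 280.0                          # toy scalar state (e.g. temperature in K)
t = datetime(2024, 1, 1, 0)
for _ in range(4):                     # one day of 6-hourly cycling
    obs = 281.0                        # raw satellite observations would enter here
    state = assimilate(state, obs)     # analysis step
    state = forecast(state)            # 6-hour forecast becomes next background
    t += timedelta(hours=6)

print(t, state)
```

The point of the sketch is the cycling structure itself: no external NWP analysis is needed once the loop is closed, which is what makes the system end-to-end.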
  • Probabilistic Emulation of a Global Climate Model with Spherical DYffusion
    Authors: Salva Rühling Cachay, Brian Henn, Oliver Watt-Meyer, Christopher S. Bretherton, Rose Yu
    • This paper introduces Spherical DYffusion, the first conditional generative model designed for probabilistic emulation of a global climate model. It achieves accurate and physically consistent global climate ensemble simulations by combining the dynamics-informed diffusion framework (DYffusion) with the Spherical Fourier Neural Operator (SFNO) architecture.
    • The model demonstrates significant improvements in climate model emulation, achieving near gold-standard performance. It reduces climate biases to within 50% of the reference model, with errors often close to the reference simulation's noise floor, outperforming the next best baseline (ACE) by more than 2x.
    • Spherical DYffusion enables stable and efficient long-term climate simulation, sustaining 100-year runs at 6-hourly timesteps with low computational overhead. It offers speed-ups of more than 25x and corresponding energy savings compared to the physics-based FV3GFS model it emulates.
    • The method is particularly effective for ensemble climate simulations, accurately reproducing climate variability consistent with the reference model and further reducing climate biases through ensemble averaging. The paper also highlights that short-term weather performance does not necessarily translate into accurate long-term climate statistics.
    --------  
    19:53
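The summary notes that ensemble averaging further reduces climate biases. The generic statistical effect behind this can be shown with a toy example: the mean of independent noisy emulations of a "true" climatology has lower error than a typical single member. The sine-wave climatology, noise level, and 16-member ensemble below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
truth = np.sin(np.linspace(0.0, 2.0 * np.pi, 100))       # toy climatology
members = [truth + 0.3 * rng.standard_normal(100) for _ in range(16)]

def rmse(x: np.ndarray) -> float:
    return float(np.sqrt(np.mean((x - truth) ** 2)))

member_rmse = float(np.mean([rmse(m) for m in members]))  # ~ noise level (0.3)
ensemble_rmse = rmse(np.mean(members, axis=0))            # ~ 0.3 / sqrt(16)
print(member_rmse, ensemble_rmse)
```

With independent errors the ensemble-mean error shrinks roughly as 1/sqrt(N); real emulator errors are correlated, so the observed reduction is smaller but still significant.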
  • FourCastNet 3: A geometric approach to probabilistic machine-learning weather forecasting at scale
    Authors: Boris Bonev, Thorsten Kurth, Ankur Mahesh, Mauro Bisson, Jean Kossaifi, Karthik Kashinath, Anima Anandkumar, William D. Collins, Michael S. Pritchard, and Alexander Keller
    • FourCastNet 3 (FCN3) introduces a geometric machine learning approach to probabilistic ensemble weather forecasting. The architecture, a purely convolutional neural network tailored to spherical geometry, respects the sphere's geometry and models the spatially correlated probabilistic nature of weather, yielding stable spectra and realistic dynamics across multiple scales.
    • FCN3 achieves superior forecasting accuracy and speed, surpassing leading conventional ensemble models and rivaling the best diffusion-based ML methods while producing forecasts 8 to 60 times faster; for instance, a 60-day global forecast at 0.25°, 6-hourly resolution is generated in under 4 minutes on a single GPU.
    • The model demonstrates exceptional physical fidelity and long-term stability, maintaining excellent probabilistic calibration and realistic spectra at lead times of up to 60 days. This mitigates the blurring and build-up of small-scale noise that challenge other machine learning models, paving the way for physically faithful data-driven probabilistic weather models.
    • A novel training paradigm combining model and data parallelism enables large-scale training on 1024 GPUs and more. All key components, including training and inference code, are fully open source, providing transparent and reproducible tools for meteorological forecasting and atmospheric-science research.
    --------  
    17:39
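Probabilistic calibration of ensembles like FCN3's is commonly scored with the continuous ranked probability score (CRPS); this scoring choice is a standard illustration, not a detail taken from the episode. A small empirical-CRPS sketch, comparing a sharp ensemble with an overdispersive one at the same verifying value:

```python
import numpy as np

def crps_empirical(ensemble: np.ndarray, y: float) -> float:
    """Empirical CRPS: E|X - y| - 0.5 * E|X - X'| over ensemble samples X."""
    term1 = np.mean(np.abs(ensemble - y))
    term2 = 0.5 * np.mean(np.abs(ensemble[:, None] - ensemble[None, :]))
    return float(term1 - term2)

rng = np.random.default_rng(1)
sharp = rng.normal(0.0, 1.0, size=1000)   # sharp, well-calibrated ensemble
wide = rng.normal(0.0, 5.0, size=1000)    # overdispersive ensemble
print(crps_empirical(sharp, 0.0), crps_empirical(wide, 0.0))
```

Lower is better: the sharp ensemble scores better because CRPS rewards concentrated probability mass near the verifying observation while still penalizing overconfidence.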


About Earthly Machine Learning

"Earthly Machine Learning (EML)" offers AI-generated insights into cutting-edge machine learning research in weather and climate sciences. Powered by Google NotebookLM, each episode distils the essence of a standout paper, helping you decide whether it is worth a deeper look. Stay updated on the ML innovations shaping our understanding of Earth. Episodes may contain hallucinations.
Podcast website
