ClimateSet

Current State

  • Emulator Code: Up and Running (code for scientists)
  • Dataset: A subset of the climate models is available on Hugging Face
  • Dataset Extension Pipeline: Currently being rebuilt for scalability and usability (package release expected end of 2024)
  • Documentation & User friendliness: In Progress

Brief Introduction

Foundation models for climate and weather forecasting have achieved major breakthroughs in recent years and are on the rise. Most of these models rely on past observational weather data (ERA5). If we want to predict more than medium-range weather, we must find a way to include climate dynamics in our models or data. One way to address this is to learn not only from past data distributions but also to train our ML models on different future climate scenarios. With ClimateSet, we present a data pipeline that enables ML practitioners to retrieve climate model data (CMIP6) across multiple resolutions, climate models, climate model ensemble members, and climatic variables. We showcased ClimateSet on a climate projection task. We hope that ClimateSet will make it easier for ML practitioners to perform climate projection and to train models for different tasks on a more generalized dataset.
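
To make the four selection axes concrete, here is a minimal sketch of how a request over climate models, ensemble members, variables, and resolutions could be enumerated. It is not the ClimateSet API: the `Selection` class, the `enumerate_selections` function, and the `resolution_km` field are hypothetical names chosen for illustration, and the resolution value is a placeholder. The model names, member labels, and variable names follow common CMIP6 conventions.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Selection:
    """One slice of climate model data (hypothetical structure, not the ClimateSet API)."""
    model: str
    member: str
    variable: str
    resolution_km: int


def enumerate_selections(models, members, variables, resolutions_km):
    """Return every combination of the four selection axes."""
    return [
        Selection(model, member, variable, resolution)
        for model, member, variable, resolution in product(
            models, members, variables, resolutions_km
        )
    ]


selections = enumerate_selections(
    models=["NorESM2-LM", "MPI-ESM1-2-LR"],  # example CMIP6 climate models
    members=["r1i1p1f1", "r2i1p1f1"],         # CMIP6-style ensemble member labels
    variables=["tas", "pr"],                  # near-surface air temperature, precipitation
    resolutions_km=[250],                     # placeholder nominal resolution (assumption)
)
print(len(selections))
```

Each `Selection` identifies one combination that a retrieval step would need to fetch, so the size of the list shows how quickly a request grows as more models, members, or variables are added.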