Staff View: :: Library Catalog

Bibliographic Details
Main Authors:	Lin, Jerry; Hu, Zeyuan; Beucler, Tom; Frields, Katherine; Christensen, Hannah; Hannah, Walter; Heuer, Helge; Ukkonnen, Peter; Mansfield, Laura A.; Zheng, Tian; Peng, Liran; Gupta, Ritwik; Gentine, Pierre; Al-Naher, Yusef; Duan, Mingjiang; Hattori, Kyo; Ji, Weiliang; Li, Chunhan; Matsuda, Kippei; Murakami, Naoki; Ron, Shlomo; Serlin, Marec; Song, Hongjian; Tanabe, Yuma; Yamamoto, Daisuke; Zhou, Jianyao; Pritchard, Mike
Format:	Preprint
Published:	2025
Subjects:	Atmospheric and Oceanic Physics; Machine Learning
Online Access:	https://arxiv.org/abs/2511.20963

_version_	1866910045647667200
author	Lin, Jerry; Hu, Zeyuan; Beucler, Tom; Frields, Katherine; Christensen, Hannah; Hannah, Walter; Heuer, Helge; Ukkonnen, Peter; Mansfield, Laura A.; Zheng, Tian; Peng, Liran; Gupta, Ritwik; Gentine, Pierre; Al-Naher, Yusef; Duan, Mingjiang; Hattori, Kyo; Ji, Weiliang; Li, Chunhan; Matsuda, Kippei; Murakami, Naoki; Ron, Shlomo; Serlin, Marec; Song, Hongjian; Tanabe, Yuma; Yamamoto, Daisuke; Zhou, Jianyao; Pritchard, Mike
author_facet	Lin, Jerry; Hu, Zeyuan; Beucler, Tom; Frields, Katherine; Christensen, Hannah; Hannah, Walter; Heuer, Helge; Ukkonnen, Peter; Mansfield, Laura A.; Zheng, Tian; Peng, Liran; Gupta, Ritwik; Gentine, Pierre; Al-Naher, Yusef; Duan, Mingjiang; Hattori, Kyo; Ji, Weiliang; Li, Chunhan; Matsuda, Kippei; Murakami, Naoki; Ron, Shlomo; Serlin, Marec; Song, Hongjian; Tanabe, Yuma; Yamamoto, Daisuke; Zhou, Jianyao; Pritchard, Mike
contents	Subgrid machine-learning (ML) parameterizations have the potential to introduce a new generation of climate models that incorporate the effects of higher-resolution physics without incurring the prohibitive computational cost associated with more explicit physics-based simulations. However, important issues, ranging from online instability to inconsistent online performance, have limited their operational use for long-term climate projections. To more rapidly drive progress in solving these issues, domain scientists and machine learning researchers opened up the offline aspect of this problem to the broader machine learning and data science community with the release of ClimSim, a NeurIPS Datasets and Benchmarks publication, and an associated Kaggle competition. This paper reports on the downstream results of the Kaggle competition by coupling emulators inspired by the winning teams' architectures to an interactive climate model (including full cloud microphysics, a regime historically prone to online instability) and systematically evaluating their online performance. Our results demonstrate that online stability in the low-resolution, real-geography setting is reproducible across multiple diverse architectures, which we consider a key milestone. All tested architectures exhibit strikingly similar offline and online biases, though their responses to architecture-agnostic design choices (e.g., expanding the list of input variables) can differ significantly. Multiple Kaggle-inspired architectures achieve state-of-the-art (SOTA) results on certain metrics such as zonal mean bias patterns and global RMSE, indicating that crowdsourcing the essence of the offline problem is one path to improving online performance in hybrid physics-AI climate simulation.
format	Preprint
id	arxiv_https___arxiv_org_abs_2511_20963
institution	arXiv
publishDate	2025
record_format	arxiv
title	Crowdsourcing the Frontier: Advancing Hybrid Physics-ML Climate Simulation via a $50,000 Kaggle Competition
topic	Atmospheric and Oceanic Physics; Machine Learning
url	https://arxiv.org/abs/2511.20963

Similar Items