Staff View: Reasoning With a Star: A Heliophysics Dataset and Benchmark for Agentic Scientific Reasoning :: Library Catalog

Bibliographic Details
Main Authors:	Lee, Kevin; Spiewak, Russell; Walsh, James
Format:	Preprint
Published:	2025
Subjects:	Artificial Intelligence; Solar and Stellar Astrophysics; Machine Learning; Space Physics
Online Access:	https://arxiv.org/abs/2511.20694

_version_	1866917258614276096
author	Lee, Kevin; Spiewak, Russell; Walsh, James
author_facet	Lee, Kevin; Spiewak, Russell; Walsh, James
contents	Scientific reasoning through Large Language Models in heliophysics involves more than just recalling facts: it requires incorporating physical assumptions, maintaining consistent units, and providing clear scientific formats through coordinated approaches. To address these challenges, we present Reasoning With a Star, a newly contributed heliophysics dataset applicable to reasoning; we also provide an initial benchmarking approach. Our data are constructed from National Aeronautics and Space Administration & University Corporation for Atmospheric Research Living With a Star summer school problem sets and compiled into a readily consumable question-and-answer structure with question contexts, reasoning steps, expected answer type, ground-truth targets, format hints, and metadata. A programmatic grader checks the predictions using unit-aware numerical tolerance, symbolic equivalence, and schema validation. We benchmark a single-shot baseline and four multi-agent patterns, finding that decomposing workflows through systems engineering principles outperforms direct prompting on problems requiring deductive reasoning rather than pure inductive recall.
format	Preprint
id	arxiv_https___arxiv_org_abs_2511_20694
institution	arXiv
publishDate	2025
record_format	arxiv
title	Reasoning With a Star: A Heliophysics Dataset and Benchmark for Agentic Scientific Reasoning
topic	Artificial Intelligence; Solar and Stellar Astrophysics; Machine Learning; Space Physics
url	https://arxiv.org/abs/2511.20694

Similar Items