Staff View: Better Think Thrice: Learning to Reason Causally with Double Counterfactual Consistency :: Library Catalog

Bibliographic Details
Main Authors:	Lin, Victoria; Xu, Xinnuo; Lawrence, Rachel; Ueno, Risa; Sharma, Amit; Gonzalez, Javier; Prasad, Niranjani
Format:	Preprint
Published:	2026
Subjects:	Machine Learning; Computation and Language
Online Access:	https://arxiv.org/abs/2602.16787

_version_	1866908840380858368
author	Lin, Victoria; Xu, Xinnuo; Lawrence, Rachel; Ueno, Risa; Sharma, Amit; Gonzalez, Javier; Prasad, Niranjani
author_facet	Lin, Victoria; Xu, Xinnuo; Lawrence, Rachel; Ueno, Risa; Sharma, Amit; Gonzalez, Javier; Prasad, Niranjani
contents	Despite their strong performance on reasoning benchmarks, large language models (LLMs) have proven brittle when presented with counterfactual questions, suggesting weaknesses in their causal reasoning ability. While recent work has demonstrated that labeled counterfactual tasks can be useful benchmarks of LLMs' causal reasoning, producing such data at the scale required to cover the vast potential space of counterfactuals is limited. In this work, we introduce double counterfactual consistency (DCC), a lightweight inference-time method for measuring and guiding the ability of LLMs to reason causally. Without requiring labeled counterfactual data, DCC verifies a model's ability to execute two important elements of causal reasoning: causal intervention and counterfactual prediction. Using DCC, we evaluate the causal reasoning abilities of various leading LLMs across a range of reasoning tasks and interventions. Moreover, we demonstrate the effectiveness of DCC as a training-free test-time rejection sampling criterion and show that it can directly improve performance on reasoning tasks across multiple model families.
format	Preprint
id	arxiv_https___arxiv_org_abs_2602_16787
institution	arXiv
publishDate	2026
record_format	arxiv
title	Better Think Thrice: Learning to Reason Causally with Double Counterfactual Consistency
topic	Machine Learning; Computation and Language
url	https://arxiv.org/abs/2602.16787