Bibliographic Details
Main Authors: Faltenbacher, Sofia; Wahl, Jonas; Herman, Rebecca; Runge, Jakob
Format: Preprint
Published: 2025
Subjects:
Online Access: https://arxiv.org/abs/2502.14719
author Faltenbacher, Sofia
Wahl, Jonas
Herman, Rebecca
Runge, Jakob
contents Causal discovery methods based on the PC algorithm are proven to be sound if all structural assumptions are fulfilled and all conditional independence tests are correct. This idealized setting is rarely given in real data. In this work, we first analyze how local errors can propagate throughout the output graph of a PC-based method, highlighting how consequential seemingly innocuous errors can become. Next, we introduce coherency scores to find assumption violations and small sample errors in the absence of a ground truth. These scores do not require statistical tests beyond those already executed by the causal discovery algorithm. Errors detected by our approach extend the set of errors that can be detected by comparable existing methods. We place our computationally cheap global error detection and quantification scores as a bridge between computationally expensive global answer-set-programming-based methods and less expensive local error detection methods. The scores are analyzed on simulated and real-world datasets.
format Preprint
id arxiv_https___arxiv_org_abs_2502_14719
institution arXiv
publishDate 2025
record_format arxiv
title How PC-based Methods Err: Towards Better Reporting of Assumption Violations and Small Sample Errors
topic Machine Learning
url https://arxiv.org/abs/2502.14719