| Main Authors: | Faltenbacher, Sofia; Wahl, Jonas; Herman, Rebecca; Runge, Jakob |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | Machine Learning |
| Online Access: | https://arxiv.org/abs/2502.14719 |
| _version_ | 1866910056652472320 |
|---|---|
| author | Faltenbacher, Sofia; Wahl, Jonas; Herman, Rebecca; Runge, Jakob |
| author_facet | Faltenbacher, Sofia; Wahl, Jonas; Herman, Rebecca; Runge, Jakob |
| contents | Causal discovery methods based on the PC algorithm are proven to be sound if all structural assumptions are fulfilled and all conditional independence tests are correct. This idealized setting is rarely given in real data. In this work, we first analyze how local errors can propagate throughout the output graph of a PC-based method, highlighting how consequential seemingly innocuous errors can become. Next, we introduce coherency scores to find assumption violations and small sample errors in the absence of a ground truth. These scores do not require statistical tests beyond those already executed by the causal discovery algorithm. Errors detected by our approach extend the set of errors that can be detected by comparable existing methods. We place our computationally cheap global error detection and quantification scores as a bridge between computationally expensive global answer-set-programming-based methods and less expensive local error detection methods. The scores are analyzed on simulated and real-world datasets. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2502_14719 |
| institution | arXiv |
| publishDate | 2025 |
| record_format | arxiv |
| title | How PC-based Methods Err: Towards Better Reporting of Assumption Violations and Small Sample Errors |
| topic | Machine Learning |
| url | https://arxiv.org/abs/2502.14719 |