Staff View: :: Library Catalog

Saved in:

Bibliographic Details
Main Author:	Foster, Margaret
Format:	Preprint
Published:	2026
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2602.00022
Tags:	Add Tag No Tags, Be the first to tag this record!

_version_	1866910006470770688
author	Foster, Margaret
author_facet	Foster, Margaret
contents	We propose a measurement framework for difficult-to-access contexts that uses indirect data traces, interpretable machine-learning models, and theory-guided triangulation to fill inaccessible measurement spaces. Many high-stakes systems of scientific and policy interest are difficult, if not impossible, to reach directly: dynamics of interest are unobservable, data are indirect and fragmented across sources, and ground truth is absent or concealed. In these settings, available data often do not support conventional strategies for analysis, such as statistical inference on a single authoritative data stream or model validation against labeled outcomes. To address this problem, we introduce a general framework for measurement in data regimes characterized by structurally missing or adversarial data. We propose combining multi-source triangulation with interpretable machine learning models. Rather than relying on accuracy against unobservable, unattainable ideal data, our framework seeks consistency across separate, partially informative models. This allows users to draw defensible conclusions about the state of the world based on cross-signal consistency or divergence from an expected state. Our framework provides an analytical workflow tailored to quantitative characterization in the absence of data sufficient for conventional statistical or causal inference. We demonstrate our approach and explicitly surface inferential limits through an empirical analysis of organizational growth and internal pressure dynamics in a clandestine militant organization, drawing on multiple observational signals that individually provide incomplete and biased views of the underlying process. The results show how triangulated, interpretable ML can recover substantively meaningful variation.
format	Preprint
id	arxiv_https___arxiv_org_abs_2602_00022
institution	arXiv
publishDate	2026
record_format	arxiv
spellingShingle	Measurement for Opaque Systems: Multi-source Triangulation with Interpretable Machine Learning Foster, Margaret Machine Learning We propose a measurement framework for difficult-to-access contexts that uses indirect data traces, interpretable machine-learning models, and theory-guided triangulation to fill inaccessible measurement spaces. Many high-stakes systems of scientific and policy interest are difficult, if not impossible, to reach directly: dynamics of interest are unobservable, data are indirect and fragmented across sources, and ground truth is absent or concealed. In these settings, available data often do not support conventional strategies for analysis, such as statistical inference on a single authoritative data stream or model validation against labeled outcomes. To address this problem, we introduce a general framework for measurement in data regimes characterized by structurally missing or adversarial data. We propose combining multi-source triangulation with interpretable machine learning models. Rather than relying on accuracy against unobservable, unattainable ideal data, our framework seeks consistency across separate, partially informative models. This allows users to draw defensible conclusions about the state of the world based on cross-signal consistency or divergence from an expected state. Our framework provides an analytical workflow tailored to quantitative characterization in the absence of data sufficient for conventional statistical or causal inference. We demonstrate our approach and explicitly surface inferential limits through an empirical analysis of organizational growth and internal pressure dynamics in a clandestine militant organization, drawing on multiple observational signals that individually provide incomplete and biased views of the underlying process. The results show how triangulated, interpretable ML can recover substantively meaningful variation.
title	Measurement for Opaque Systems: Multi-source Triangulation with Interpretable Machine Learning
topic	Machine Learning
url	https://arxiv.org/abs/2602.00022

Similar Items