Bibliographic Details
Main Authors: Huang, Shimeng, Robinson, Matthew, Locatello, Francesco
Format: Preprint
Published: 2026
Subjects: Machine Learning
Online Access: https://arxiv.org/abs/2602.19782
author Huang, Shimeng
Robinson, Matthew
Locatello, Francesco
contents Mendelian Randomization (MR) is a prominent observational epidemiological research method designed to address unobserved confounding when estimating causal effects. However, core assumptions -- particularly the independence between instruments and unobserved confounders -- are often violated due to population stratification or assortative mating. Leveraging the increasing availability of multi-environment data, we propose a representation learning framework that exploits cross-environment invariance to recover latent exogenous components of genetic instruments. We provide theoretical guarantees for identifying these latent instruments under various mixing mechanisms and demonstrate the effectiveness of our approach through simulations and semi-synthetic experiments using data from the All of Us Research Hub.
format Preprint
id arxiv_https___arxiv_org_abs_2602_19782
institution arXiv
publishDate 2026
record_format arxiv
title Addressing Instrument-Outcome Confounding in Mendelian Randomization through Representation Learning
topic Machine Learning
url https://arxiv.org/abs/2602.19782