Staff View: Measuring Dependence between Events :: Library Catalog

Bibliographic Details
Main Authors:	Pohle, Marc-Oliver; Dimitriadis, Timo; Wermuth, Jan-Lukas
Format:	Preprint
Published:	2024
Subjects:	Methodology
Online Access:	https://arxiv.org/abs/2403.17580

_version_	1866909898700226560
author	Pohle, Marc-Oliver; Dimitriadis, Timo; Wermuth, Jan-Lukas
author_facet	Pohle, Marc-Oliver; Dimitriadis, Timo; Wermuth, Jan-Lukas
contents	Measuring dependence between two events, or equivalently between two binary random variables, amounts to expressing the dependence structure inherent in a $2\times 2$ contingency table in a real number between $-1$ and $1$. Countless such dependence measures exist, but there is little theoretical guidance on how they compare and on their advantages and shortcomings. Thus, practitioners might be overwhelmed by the problem of choosing a suitable measure. We provide a set of natural desirable properties that a proper dependence measure should fulfill. We show that Yule's Q and the little-known Cole coefficient are proper, while the most widely-used measures, the phi coefficient and all contingency coefficients, are improper. They have a severe attainability problem, that is, even under perfect dependence they can be very far away from $-1$ and $1$, and often differ substantially from the proper measures in that they understate strength of dependence. The structural reason is that these are measures for equality of events rather than of dependence. We derive the (in some instances non-standard) limiting distributions of the measures and illustrate how asymptotically valid confidence intervals can be constructed. In a case study on drug consumption we demonstrate how misleading conclusions may arise from the use of improper dependence measures.
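The attainability problem mentioned in the abstract can be illustrated with a minimal sketch. It uses the standard textbook formulas for the phi coefficient and Yule's Q on a 2x2 table with cell counts a, b, c, d. The counts are hypothetical and do not come from the preprint, and reading "perfect dependence" as "one event implies the other" (a zero off-diagonal cell) is an assumption made here for illustration. The Cole coefficient is left out because its definition is not given in this record.

```python
import math


def phi_coefficient(a, b, c, d):
    """Phi coefficient of the 2x2 table [[a, b], [c, d]]."""
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denom == 0:
        raise ValueError("phi is undefined when a row or column total is zero")
    return (a * d - b * c) / denom


def yules_q(a, b, c, d):
    """Yule's Q of the 2x2 table [[a, b], [c, d]]."""
    denom = a * d + b * c
    if denom == 0:
        raise ValueError("Yule's Q is undefined when a*d + b*c is zero")
    return (a * d - b * c) / denom


# Hypothetical counts (not from the preprint). Rows: A / not A, columns: B / not B.
# b = 0 means A never occurs without B, i.e. A implies B (assumed reading of
# "perfect positive dependence"), although A and B are far from being equal events.
a, b, c, d = 10, 0, 40, 50

print(f"Yule's Q: {yules_q(a, b, c, d):.4f}")          # 1.0000
print(f"phi:      {phi_coefficient(a, b, c, d):.4f}")  # 0.3333
```

With these counts Yule's Q reaches 1 while phi stays at one third, which matches the abstract's remark that phi behaves like a measure of equality of events rather than of dependence.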
format	Preprint
id	arxiv_https___arxiv_org_abs_2403_17580
institution	arXiv
publishDate	2024
record_format	arxiv
title	Measuring Dependence between Events
topic	Methodology
url	https://arxiv.org/abs/2403.17580