Bibliographic Details
Main Author: Daon, Yair
Format: Preprint
Published: 2020
Subjects:
Online Access: https://arxiv.org/abs/2007.12032
author Daon, Yair
author_facet Daon, Yair
contents Estimation of parameters in physical processes often demands costly measurements, prompting the pursuit of an optimal measurement strategy. Finding such a strategy is termed the problem of optimal experimental design, abbreviated as optimal design. Remarkably, optimal designs can yield tightly clustered measurement locations, leading researchers to fundamentally revise the design problem just to circumvent this issue. Some authors introduce correlation among error terms that are initially independent, while others restrict measurement locations to a finite set. While both approaches may prevent clusterization, they also fundamentally alter the optimal design problem. In this study, we consider Bayesian D-optimal designs, i.e., designs that maximize the expected Kullback-Leibler divergence between posterior and prior. We propose an analytically tractable model for D-optimal designs over Hilbert spaces. In this framework, we make several key contributions: (a) We establish that measurement clusterization is a generic trait of D-optimal designs for linear inverse problems with independent Gaussian measurement errors and a Gaussian prior. (b) We prove that introducing correlations among measurement error terms mitigates clusterization. (c) We characterize D-optimal designs as reducing uncertainty across a subset of prior covariance eigenvectors. (d) We leverage this characterization to argue that measurement clusterization arises as a consequence of the pigeonhole principle: clusterization occurs when more measurements are taken than there are locations where the selected eigenvectors are large and the others are small. Finally, we use our analysis to argue against the use of Gaussian priors with linearized physical models when seeking a D-optimal design.
format Preprint
id arxiv_https___arxiv_org_abs_2007_12032
institution arXiv
publishDate 2020
record_format arxiv
title Clusterization in D-optimal designs: the case against linearization
topic Statistics Theory
62F15, 35R30 (Primary) 28C20 (Secondary)
url https://arxiv.org/abs/2007.12032