Staff View: Efficient subsampling for exponential family models :: Library Catalog

Bibliographic Details
Main Authors:	Dasgupta, Subhadra; Dette, Holger
Format:	Preprint
Published:	2023
Subjects:	Methodology
Online Access:	https://arxiv.org/abs/2306.16821

_version_	1866914718882463744
author	Dasgupta, Subhadra; Dette, Holger
author_facet	Dasgupta, Subhadra; Dette, Holger
contents	We propose a novel two-stage subsampling algorithm based on optimal design principles. In the first stage, we use a density-based clustering algorithm to identify an approximating design space for the predictors from an initial subsample. Next, we determine an optimal approximate design on this design space. Finally, we use matrix distances such as the Procrustes, Frobenius, and square-root distance to define the remaining subsample, such that its points are "closest" to the support points of the optimal design. Our approach reflects the specific nature of the information matrix as a weighted sum of non-negative definite Fisher information matrices evaluated at the design points and applies to a large class of regression models including models where the Fisher information is of rank larger than $1$.
format	Preprint
id	arxiv_https___arxiv_org_abs_2306_16821
institution	arXiv
publishDate	2023
record_format	arxiv
title	Efficient subsampling for exponential family models
topic	Methodology
url	https://arxiv.org/abs/2306.16821
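
Illustrative note on the final selection step described in the contents field: the abstract says the remaining subsample is chosen so that its points are "closest", in a matrix distance, to the support points of the optimal design. The sketch below is one possible reading of that step, not the authors' implementation. It assumes a logistic regression model, whose per-observation Fisher information is p(1-p) x x^T, and uses the Frobenius distance only. The function names (logistic_information, closest_points), the parameter value beta, and the choice of k candidates per support point are all hypothetical.

```python
import numpy as np


def logistic_information(x, beta):
    """Fisher information of one observation in logistic regression.

    Assumed model for illustration: I(x) = p(1 - p) * x x^T, p = sigmoid(x . beta).
    """
    p = 1.0 / (1.0 + np.exp(-(x @ beta)))
    return p * (1.0 - p) * np.outer(x, x)


def frobenius_distance(a, b):
    """Frobenius norm of the difference between two matrices."""
    return np.linalg.norm(a - b, ord="fro")


def closest_points(candidates, support, beta, k):
    """For each support point, return the indices of the k candidates whose
    information matrices are nearest to it in Frobenius distance.

    Hypothetical helper: it ignores design weights and may pick the same
    candidate for two different support points.
    """
    cand_info = [logistic_information(x, beta) for x in candidates]
    chosen = []
    for s in support:
        target = logistic_information(s, beta)
        dists = np.array([frobenius_distance(m, target) for m in cand_info])
        chosen.append(np.argsort(dists)[:k])
    return chosen


# Toy data: an intercept column plus one predictor.
candidates = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, -1.0]])
support = np.array([[1.0, 2.0]])
beta = np.array([0.0, 1.0])
picked = closest_points(candidates, support, beta, k=1)
```

Comparing information matrices rather than the predictor vectors themselves is what lets the same idea carry over to models whose Fisher information has rank larger than 1, as the abstract notes. The Procrustes and square-root distances mentioned there could be swapped in for frobenius_distance.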

Similar Items