Table of Contents: :: Library Catalog

Bibliographic Details
Main Author:	Li, Jiawen
Format:	Preprint
Published:	2024
Subjects:	Computation; Statistics Theory
Online Access:	https://arxiv.org/abs/2410.21922

Table of Contents:

We introduce Prior Knowledge Acceleration (PKA), a batch-update method for variance that reuses previously computed sufficient statistics to avoid full recomputation. The update identity is algebraically equivalent to the pairwise formula of Chan, Golub, and LeVeque (1983); our contribution is a runtime-cost analysis that derives an explicit acceleration factor $\tau_a$ and identifies the data-size regime where batch updating outperforms both naïve recomputation and Ross's single-sample method. We prove that Ross's approach is preferable only when the new batch contains a single sample ($N_2 = 1$). We further generalise the framework to covariance and other decomposable statistics. Benchmarks against Welford, Chan pairwise, and naïve two-pass baselines on synthetic and real-world streaming data confirm the theoretical predictions, with speedups of up to $454\times$ when the prior dataset is large relative to the new batch.
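As a rough illustration of the kind of batch update the abstract describes, the sketch below implements the standard pairwise merge of Chan, Golub, and LeVeque, which the abstract states is algebraically equivalent to the PKA identity. It is not the author's implementation: the names `Stats`, `summarize`, `merge`, and `variance` are hypothetical, and the choice of (count, mean, sum of squared deviations) as the stored statistics is an assumption made for this example.

```python
from dataclasses import dataclass
from statistics import fmean, pvariance


@dataclass(frozen=True)
class Stats:
    """Summary of a batch (hypothetical container for this sketch)."""
    n: int        # number of samples
    mean: float   # sample mean
    m2: float     # sum of squared deviations from the mean


def summarize(batch):
    """Compute the summary of one batch with a direct two-pass scan."""
    n = len(batch)
    if n == 0:
        return Stats(0, 0.0, 0.0)
    mean = fmean(batch)
    m2 = sum((x - mean) ** 2 for x in batch)
    return Stats(n, mean, m2)


def merge(prior, new):
    """Combine two summaries with the Chan-Golub-LeVeque pairwise formula.

    Only the new batch has to be scanned; the prior data is represented
    entirely by its stored summary.
    """
    if prior.n == 0:
        return new
    if new.n == 0:
        return prior
    n = prior.n + new.n
    delta = new.mean - prior.mean
    mean = prior.mean + delta * new.n / n
    m2 = prior.m2 + new.m2 + delta * delta * prior.n * new.n / n
    return Stats(n, mean, m2)


def variance(stats, ddof=0):
    """Variance from a summary (ddof=0: population, ddof=1: sample)."""
    return stats.m2 / (stats.n - ddof)


if __name__ == "__main__":
    prior_data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
    new_batch = [1.0, 3.0]

    prior_stats = summarize(prior_data)   # computed once, then reused
    updated = merge(prior_stats, summarize(new_batch))

    # Compare against a full recomputation over all the data.
    print(variance(updated), pvariance(prior_data + new_batch))
```

The cost of `merge` depends only on the size of the new batch, which is consistent with the abstract's claim that the benefit grows when the prior dataset is large relative to the new batch.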

Similar Items