Bibliographic Details
Main Authors: Hofkes, Matthew, Nychka, Douglas, Cath, Tzahi, Hering, Amanda, McGonagill, Craig
Format: Preprint
Published: 2024
Online Access: https://arxiv.org/abs/2402.03459
Table of Contents:
  • Many industrial and engineering processes monitored as time series have smooth trends that indicate normal behavior and occasional anomalous patterns that can indicate a problem. This kind of behavior can be modeled by a smooth trend, such as a spline or Gaussian process, and a disruption based on a sparser representation. Our approach is to expand the process signal into two sets of basis functions: one set uses L2 penalties on the coefficients, and the other set uses L1 penalties to control sparsity. From a frequentist perspective, this results in a hybrid smoother that combines cubic smoothing splines and the LASSO. As a Bayesian hierarchical model (BHM), this is equivalent to priors giving a Gaussian process for the trend and a Laplace distribution for the anomaly coefficients. For the hybrid smoother, we propose two new ways of determining the penalty parameters that use effective degrees of freedom and contrast this with the BHM that uses loosely informative inverse gamma priors. Several reformulations are used to make sampling the BHM posterior more efficient, including some novel features in orthogonalizing and regularizing the model basis functions. This methodology is motivated by a substantive application, offline monitoring of a water treatment process for municipal water filtration. We also test the robustness of these methods with a Monte Carlo study designed to inspect a range of trended time series under an array of conditions and compare this new approach to multiple existing modern methods. Both the hybrid smoother and the full BHM give comparable results with small false positive and false negative rates. Besides being successful in the water treatment application, this work can be easily extended to other Gaussian process models and other features that represent process disruptions in offline data.
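
The two-penalty idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a discrete second-difference (Whittaker-type) penalty as a stand-in for the cubic smoothing spline, an identity "spike" basis for the anomalies, and hand-picked penalty values `lam2` and `lam1` rather than the effective-degrees-of-freedom selection the abstract mentions. The function name `hybrid_smoother` and the simulated data are hypothetical. The sketch minimizes ||y - trend - gamma||^2 + lam2 * ||D trend||^2 + lam1 * ||gamma||_1 by alternating a linear smoothing step with soft-thresholding.

```python
import numpy as np


def hybrid_smoother(y, lam2, lam1, n_iter=200):
    """Alternate an L2-penalized trend fit with an L1-penalized spike fit.

    Assumptions (not from the source): second-difference roughness penalty
    for the trend, identity basis for anomalies, fixed penalty values.
    """
    n = len(y)
    # (n-2) x n second-difference operator: rows are e[i] - 2e[i+1] + e[i+2].
    D = np.diff(np.eye(n), n=2, axis=0)
    # Linear smoother for the trend given the current anomaly estimate.
    S = np.linalg.inv(np.eye(n) + lam2 * D.T @ D)
    gamma = np.zeros(n)
    for _ in range(n_iter):
        trend = S @ (y - gamma)          # L2 (ridge-type) step
        resid = y - trend
        # L1 step: soft-threshold the residual at lam1 / 2.
        gamma = np.sign(resid) * np.maximum(np.abs(resid) - lam1 / 2.0, 0.0)
    trend = S @ (y - gamma)              # trend consistent with final gamma
    return trend, gamma


# Simulated example: smooth trend, small noise, two injected spikes.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 200)
y = np.sin(2 * np.pi * t) + 0.05 * rng.standard_normal(200)
y[50] += 2.0
y[120] -= 1.5

trend, gamma = hybrid_smoother(y, lam2=100.0, lam1=0.6)
flagged = np.flatnonzero(gamma)          # indices with nonzero anomaly coefficient
print("flagged indices:", flagged)
```

Nonzero entries of `gamma` mark candidate anomalies, while `trend` carries the smooth component. In this sketch a larger `lam1` flags fewer points and a larger `lam2` gives a stiffer trend.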