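Staff View: Frequency-domain alignment of heterogeneous, multidimensional separations data through complex orthogonal Procrustes analysis :: Library Catalog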

Bibliographic Details
Main Author:	Armstrong, Michael Sorochan
Format:	Preprint
Published:	2025
Subjects:	Numerical Analysis; Machine Learning
Online Access:	https://arxiv.org/abs/2502.12810

_version_	1866916619396055040
author	Armstrong, Michael Sorochan
author_facet	Armstrong, Michael Sorochan
contents	Multidimensional separations data can reveal detailed information about complex biological samples. However, data analysis has been an ongoing challenge in the area, since the peaks that represent chemical factors may drift along the first- and second-dimension retention times over the course of several analytical runs. This makes higher-level analyses of the data difficult, since a one-to-one comparison of samples is seldom possible without sophisticated pre-processing routines. Further complicating the issue, closely co-eluting components need to be resolved, typically using some variant of Parallel Factor Analysis (PARAFAC), Multivariate Curve Resolution (MCR), or the recently explored Shift-Invariant Multi-linearity. These algorithms work with a user-specified number of components and with regions of interest that are then summarized as a peak table that is invariant to shift. However, identifying regions of interest across truly heterogeneous data remains an ongoing issue for automated deployment of these algorithms. This work offers a very simple solution to the alignment problem through an orthogonal Procrustes analysis of the frequency-domain representation of synthetic multidimensional separations data, for peaks that are logarithmically transformed to simulate shift while preserving the underlying topology of the data. Using this very simple method, two synthetic chromatograms can be compared under close to the worst possible scenarios for alignment.
format	Preprint
id	arxiv_https___arxiv_org_abs_2502_12810
institution	arXiv
publishDate	2025
record_format	arxiv
title	Frequency-domain alignment of heterogeneous, multidimensional separations data through complex orthogonal Procrustes analysis
topic	Numerical Analysis; Machine Learning
url	https://arxiv.org/abs/2502.12810