Bibliographic Details
Main Authors: Arrubarrena, Paola, Lemercier, Maud, Nikolic, Bojan, Lyons, Terry, Cass, Thomas
Format: Preprint
Published: 2024
Subjects:
Online Access: https://arxiv.org/abs/2402.14892
_version_ 1866913261702610944
author Arrubarrena, Paola
Lemercier, Maud
Nikolic, Bojan
Lyons, Terry
Cass, Thomas
contents We introduce SigNova, a new semi-supervised framework for detecting anomalies in streamed data. While our initial examples focus on detecting radio-frequency interference (RFI) in digitized signals within the field of radio astronomy, SigNova is applicable to any type of streamed data. The framework comprises three primary components. Firstly, we use the signature transform to extract a canonical collection of summary statistics from observational sequences. This allows us to represent variable-length visibility samples as finite-dimensional feature vectors. Secondly, each feature vector is assigned a novelty score, calculated as the Mahalanobis distance to its nearest neighbor in an RFI-free training set. By thresholding these scores, we identify observation ranges that deviate from the expected behavior of RFI-free visibility samples without relying on stringent distributional assumptions. Thirdly, we integrate this anomaly detector with Pysegments, a segmentation algorithm, to localize consecutive observations contaminated with RFI, if any. This approach provides a compelling alternative to classical windowing techniques commonly used for RFI detection. Importantly, the complexity of our algorithm depends on the RFI pattern rather than on the size of the observation window. We demonstrate how SigNova improves the detection of various types of RFI (e.g., broadband and narrowband) in time-frequency visibility data. We validate our framework on data from the Murchison Widefield Array (MWA) telescope and on simulated data from the Hydrogen Epoch of Reionization Array (HERA).
format Preprint
id arxiv_https___arxiv_org_abs_2402_14892
institution arXiv
publishDate 2024
record_format arxiv
title Novelty Detection on Radio Astronomy Data using Signatures
topic Instrumentation and Methods for Astrophysics
Machine Learning
60L10, 60L20
url https://arxiv.org/abs/2402.14892
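example The first two components described in the abstract (signature features, then a nearest-neighbor Mahalanobis novelty score with a threshold) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the truncation of the signature at depth 2, the random-walk stand-in for RFI-free visibility samples, the pseudo-inverse covariance, and the threshold rule (maximum score on held-out clean samples) are all assumptions made for illustration. The Pysegments localization step is not covered.

```python
import numpy as np


def signature_depth2(path):
    """Depth-2 signature of a piecewise-linear path given as an (n_points, d) array.

    Level 1 is the total increment; level 2 holds the iterated integrals,
    computed segment by segment (Chen's identity for linear pieces).
    """
    path = np.asarray(path, dtype=float)
    steps = np.diff(path, axis=0)            # increment of each segment
    offsets = path[:-1] - path[0]            # position at the start of each segment
    level1 = path[-1] - path[0]
    level2 = offsets.T @ steps + 0.5 * (steps.T @ steps)
    return np.concatenate([level1, level2.ravel()])


def novelty_scores(train_features, query_features):
    """Mahalanobis distance from each query vector to its nearest training vector.

    Assumption: the metric is the pseudo-inverse of the training covariance.
    """
    cov_pinv = np.linalg.pinv(np.cov(train_features, rowvar=False))
    diffs = query_features[:, None, :] - train_features[None, :, :]
    sq_dist = np.einsum("qtf,fg,qtg->qt", diffs, cov_pinv, diffs)
    return np.sqrt(np.maximum(sq_dist, 0.0).min(axis=1))


rng = np.random.default_rng(0)


def clean_path(n_steps=50):
    # Hypothetical stand-in for an RFI-free sample: a 2-D random walk
    # (think real and imaginary parts of a visibility).
    return np.cumsum(rng.normal(scale=0.1, size=(n_steps, 2)), axis=0)


train = np.stack([signature_depth2(clean_path()) for _ in range(200)])
held_out = np.stack([signature_depth2(clean_path()) for _ in range(20)])

# Contaminated sample: a clean path with a large persistent jump in one channel.
contaminated = clean_path()
contaminated[25:] += np.array([5.0, 0.0])
query = signature_depth2(contaminated)[None, :]

# Assumed threshold rule: the largest score seen on held-out clean samples.
threshold = novelty_scores(train, held_out).max()
is_novel = novelty_scores(train, query)[0] > threshold
print("flagged as novel:", bool(is_novel))
```

Because the signature is computed from increments along the path, `signature_depth2` returns a vector of the same length for inputs with any number of points, which is what lets variable-length samples share one feature space.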