Bibliographic Details
Main Author: Becker, Devan
Format: Preprint
Published: 2025
Subjects: Methodology
Online Access: https://arxiv.org/abs/2508.03508
author Becker, Devan
contents SARS-CoV-2 lineages are defined according to placement in a phylogenetic tree, but approximated by a list of mutations based on sequences collected from clinical sampling. Wastewater lineage abundance is generally found under the assumption that the mutation frequency is approximately equal to the sum of the abundances of the lineages to which it belongs. By leveraging numerous samples collected over time, I am able to estimate the temporal trends of the abundance of lineages as well as the definitions of those lineages. This is accomplished by assuming that collections of mutations that appear together over time constitute lineages, then estimating the proportions as before. Three main models are considered: two that incorporate an explicit temporal trend with different constraints on the abundances, and one that does not estimate a temporal component. It is found that estimated lineage definitions correspond to known lineage definitions with matching temporal trends for the lineage abundances, despite having no information from clinical samples. I refer to this set of methods as "UnMuted" since the mutations are allowed to speak for themselves.
format Preprint
id arxiv_https___arxiv_org_abs_2508_03508
institution arXiv
publishDate 2025
record_format arxiv
title UnMuted: Defining SARS-CoV-2 Lineages According to Temporally Consistent Mutation Clusters in Wastewater Samples
topic Methodology
url https://arxiv.org/abs/2508.03508
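
The abstract states the standard working assumption that a mutation's frequency in a wastewater sample is approximately the sum of the abundances of the lineages that carry it. The following is a minimal sketch of that assumption only, not the UnMuted method itself, which also estimates the lineage definitions. The definition matrix, the abundance values, and the function names are all hypothetical, and projected gradient descent is simply one easy way to solve the resulting non-negative least-squares problem.

```python
# Hypothetical toy data: 4 mutations (rows) and 2 lineages (columns).
# definitions[m][l] == 1 means mutation m is part of lineage l's definition.
definitions = [
    [1, 0],
    [1, 1],
    [0, 1],
    [0, 1],
]


def expected_frequencies(definitions, abundances):
    """Mutation frequency = sum of abundances of the lineages carrying it."""
    return [sum(d * a for d, a in zip(row, abundances)) for row in definitions]


def estimate_abundances(definitions, frequencies, steps=5000, lr=0.05):
    """Non-negative least squares via projected gradient descent.

    Definitions are treated as known and fixed here (an assumption of this
    sketch); only the abundances are estimated.
    """
    n_lineages = len(definitions[0])
    abundances = [1.0 / n_lineages] * n_lineages
    for _ in range(steps):
        predicted = expected_frequencies(definitions, abundances)
        residuals = [p - f for p, f in zip(predicted, frequencies)]
        for l in range(n_lineages):
            # Gradient of 0.5 * ||D a - f||^2 with respect to a[l].
            grad = sum(row[l] * r for row, r in zip(definitions, residuals))
            # Project back onto the non-negative orthant.
            abundances[l] = max(0.0, abundances[l] - lr * grad)
    return abundances


# Simulate one sample from assumed abundances, then recover them.
true_abundances = [0.3, 0.7]
observed = expected_frequencies(definitions, true_abundances)
estimated = estimate_abundances(definitions, observed)
```

In this noise-free toy case the estimate should match the assumed abundances closely. Real wastewater data are noisy, so an exact match would not be expected.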