Bibliographic Details
Main Authors: Koseki, Yusuke, Takeshima, Hirohiko, Yoneda, Ryuji, Katayanagi, Kaito, Ito, Gen, Yamanaka, Hiroki
Format: Scientific article
Language: English
Published: Molecular ecology resources 2025
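Subjects: DNA, Environmental; DNA Barcoding, Taxonomic; Computational Biology; Genetic Variation; Metagenomics; Software; Sequence Analysis, DNA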
Online Access: https://pubmed.ncbi.nlm.nih.gov/40755083/
author Koseki, Yusuke
Takeshima, Hirohiko
Yoneda, Ryuji
Katayanagi, Kaito
Ito, Gen
Yamanaka, Hiroki
author_facet Koseki, Yusuke
Takeshima, Hirohiko
Yoneda, Ryuji
Katayanagi, Kaito
Ito, Gen
Yamanaka, Hiroki
collection PubMed - marine biology
contents gmmDenoise: A New Method and R Package for High-Confidence Sequence Variant Filtering in Environmental DNA Amplicon Analysis. Koseki, Yusuke; Takeshima, Hirohiko; Yoneda, Ryuji; Katayanagi, Kaito; Ito, Gen; Yamanaka, Hiroki. DNA, Environmental; DNA Barcoding, Taxonomic; Computational Biology; Genetic Variation; Metagenomics; Software; Sequence Analysis, DNA.
Assessing and monitoring genetic diversity is vital for understanding the ecology and evolution of natural populations but is often challenging in animal and plant species because tissue sampling is technically and physically demanding. Although environmental DNA (eDNA) metabarcoding is a promising alternative to traditional population genetic monitoring based on biological samples, its practical application remains challenging because spurious sequences persist in the amplicon data, even after processing with existing sequence filtering and denoising (error correction) methods. Here we developed a novel amplicon filtering approach that can effectively eliminate such spurious amplicon sequence variants (ASVs) in eDNA metabarcoding data. A simple simulation of eDNA metabarcoding processes was performed to understand the patterns of read count (abundance) distributions of true ASVs and their polymerase chain reaction (PCR)-generated artefacts (i.e., false-positive ASVs). Based on the simulation results, the approach was developed to estimate the abundance distributions of true and false-positive ASVs using Gaussian mixture models and to determine a statistically based threshold between them. The developed approach was implemented as an R package, gmmDenoise, and evaluated using single-species metabarcoding datasets in which all or some true ASVs (i.e., haplotypes) were known. Example analyses using community (multi-species) metabarcoding datasets were also performed to demonstrate how gmmDenoise can be used to derive reliable intraspecific diversity estimates and population genetic inferences from noisy amplicon sequencing data. The gmmDenoise package is freely available in the GitHub repository (https://github.com/YSKoseki/gmmDenoise).
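The idea summarized in the abstract, modelling ASV abundances as a mixture of Gaussians and placing a cutoff between the low-abundance (likely artefact) and high-abundance (likely true) components, can be illustrated with a minimal sketch. This is not the gmmDenoise package or its API: the code below is plain Python rather than R, the function names (`fit_two_component_gmm`, `crossing_point`) are hypothetical, the read counts are invented, and three choices are assumptions made only for illustration: working on log10 read counts, fixing the mixture at two components, and placing the cutoff where the two weighted component densities cross. The package itself may select the number of components and the threshold differently.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_component_gmm(values, n_iter=200, min_sd=1e-3):
    """Fit a 1-D, two-component Gaussian mixture by plain EM.

    Returns (weights, means, sds), each a list of length 2.
    """
    xs = sorted(values)
    n = len(xs)
    half = n // 2
    # Initialise the two means from the lower and upper halves of the data.
    means = [sum(xs[:half]) / half, sum(xs[half:]) / (n - half)]
    grand_mean = sum(xs) / n
    grand_sd = math.sqrt(sum((x - grand_mean) ** 2 for x in xs) / n)
    sds = [max(grand_sd, min_sd)] * 2
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each observation.
        resp = []
        for x in xs:
            p = [weights[k] * normal_pdf(x, means[k], sds[k]) for k in range(2)]
            total = p[0] + p[1]
            resp.append([p[0] / total, p[1] / total] if total > 0 else [0.5, 0.5])
        # M-step: update weights, means and standard deviations.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk <= 0:
                continue
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk
            sds[k] = max(math.sqrt(var), min_sd)  # floor avoids collapse
    return weights, means, sds


def crossing_point(weights, means, sds, n_steps=100):
    """Bisect between the two means for the point where the weighted
    densities are equal (an assumed threshold rule, for illustration)."""
    lo_k, hi_k = (0, 1) if means[0] <= means[1] else (1, 0)

    def diff(x):
        return (weights[hi_k] * normal_pdf(x, means[hi_k], sds[hi_k])
                - weights[lo_k] * normal_pdf(x, means[lo_k], sds[lo_k]))

    lo, hi = means[lo_k], means[hi_k]
    if not (diff(lo) < 0 < diff(hi)):
        return (lo + hi) / 2.0  # fallback when no clean crossing exists
    for _ in range(n_steps):
        mid = (lo + hi) / 2.0
        if diff(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Synthetic read counts, invented for illustration only.
read_counts = [2, 2, 3, 3, 3, 4, 4, 5, 5, 6, 7, 8, 900, 1200, 1500, 2100, 3000]
log_counts = [math.log10(c) for c in read_counts]

weights, means, sds = fit_two_component_gmm(log_counts)
cutoff_log = crossing_point(weights, means, sds)
cutoff_count = 10 ** cutoff_log
kept = [c for c in read_counts if math.log10(c) > cutoff_log]

print("read-count cutoff:", round(cutoff_count, 1))
print("ASV read counts retained:", kept)
```

In this toy input the two abundance groups are far apart, so almost any reasonable rule separates them; real amplicon data are noisier, which is where a statistically derived threshold, as the abstract describes, matters.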
format Scientific article
id pubmed_40755083
institution PubMed
language en
publishDate 2025
publisher Molecular ecology resources
record_format pubmed
title gmmDenoise: A New Method and R Package for High-Confidence Sequence Variant Filtering in Environmental DNA Amplicon Analysis.
topic DNA, Environmental
DNA Barcoding, Taxonomic
Computational Biology
Genetic Variation
Metagenomics
Software
Sequence Analysis, DNA
url https://pubmed.ncbi.nlm.nih.gov/40755083/