Bibliographic Details
Main Authors: Jenike, Katharine M., Campos-Domínguez, Lucía, Boddé, Marilou, Cerca, José, Hodson, Christina N., Schatz, Michael C., Jaron, Kamil S.
Format: Preprint
Published: 2024
Subjects: Genomics
Online Access: https://arxiv.org/abs/2404.01519
author Jenike, Katharine M.
Campos-Domínguez, Lucía
Boddé, Marilou
Cerca, José
Hodson, Christina N.
Schatz, Michael C.
Jaron, Kamil S.
contents The wide array of currently available genomes displays a wonderful diversity in size, composition, and structure, with many more to come thanks to several global biodiversity genomics initiatives launched in recent years. However, sequencing genomes, even with all the recent advances, can still be challenging for both technical reasons (e.g. small physical size, contaminated samples, or limited access to appropriate sequencing platforms) and biological reasons (e.g. germline-restricted DNA, variable ploidy levels, sex chromosomes, or very large genomes). In recent years, k-mer-based techniques have become popular as a way to overcome some of these challenges. They are based on the simple process of dividing the analysed sequences (e.g. raw reads or genomes) into a set of sub-sequences of length k, called k-mers. Despite this apparent simplicity, k-mer-based analysis allows for a rapid and intuitive assessment of complex sequencing datasets. Here, we provide the first comprehensive review of the theoretical properties and practical applications of k-mers in biodiversity genomics, serving as a reference manual for this powerful approach.
format Preprint
id arxiv_https___arxiv_org_abs_2404_01519
institution arXiv
publishDate 2024
record_format arxiv
title Guide to k-mer approaches for genomics across the tree of life
topic Genomics
url https://arxiv.org/abs/2404.01519