Main Authors: Tsai, Cheng-Hung, Stajich, Jason
Format: Digital resource
Language:
Published: Zenodo 2025
Subjects:
Online Access: https://doi.org/10.5281/zenodo.16778021
_version_ 1866902304397983744
author Tsai, Cheng-Hung
Stajich, Jason
author_facet Tsai, Cheng-Hung
Stajich, Jason
contents Dbcanlight is a lightweight rewrite of run_dbcan, a widely used CAZyme annotation tool. It uses pyhmmer, a Cython binding to HMMER3, in place of the HMMER3 CLI suite as the backend for its searches, which improves multithreading performance. It also removes a limitation of run_dbcan, which required large sequence files to be split manually beforehand. The main program, dbcanlight, comprises three modules: build, search and conclude. The build module downloads the required databases from the dbCAN website; the search module searches against the protein HMM, substrate HMM or diamond databases and reports the hits separately; and the conclude module gathers the results produced by each search and provides a summary. The output format closely resembles that of run_dbcan, with minor cleanup. For example, run_dbcan may report the same substrate multiple times for a gene that matches several profiles with that substrate, whereas dbcanlight reports it only once. Dbcanlight reimplements only the core features of run_dbcan, that is, searching for CAZyme and substrate matches with hmmer, diamond and dbcansub. Submodules such as signalP and CGCFinder are not implemented.
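The substrate cleanup described in the abstract can be illustrated with a minimal sketch. This is not dbcanlight's actual code: the function name merge_substrates, the (gene, substrate) input layout, and the gene and substrate names are all hypothetical, chosen only to show how a substrate that recurs across several matching profiles could be reported once per gene.

```python
def merge_substrates(hits):
    """Collapse repeated substrates per gene, keeping first-seen order.

    `hits` is an iterable of (gene_id, substrate) pairs, one pair per
    matching profile. This layout is an assumption for illustration,
    not the tool's real output format.
    """
    merged = {}
    for gene_id, substrate in hits:
        seen = merged.setdefault(gene_id, [])
        # Record each substrate only the first time it appears for a gene.
        if substrate not in seen:
            seen.append(substrate)
    return merged


# Hypothetical hits: gene1 matches two profiles that share the substrate "xylan".
hits = [
    ("gene1", "xylan"),
    ("gene1", "xylan"),
    ("gene1", "cellulose"),
    ("gene2", "chitin"),
]
summary = merge_substrates(hits)
for gene_id, substrates in summary.items():
    print(gene_id, ", ".join(substrates))
```

A list is used instead of a set so that substrates stay in the order in which they were first matched, which keeps the report stable between runs.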
format Digital resource
id zenodo_https___doi_org_10_5281_zenodo_16778021
institution Zenodo
language
publishDate 2025
publisher Zenodo
record_format zenodo
title Dbcanlight: a lightweight CAZyme annotation tool
topic Genomics
Functional annotation
CAZyme
url https://doi.org/10.5281/zenodo.16778021