Bibliographic Details
Main Authors: Hui, Sun; Yanfeng, Ding; Ma, Huidong; Xu, Chang; Jin, Keyan; Zu, Lizheng; Zhong, Cheng; Liu, Xiaoguang; Wang, Gang; Cai, Wentong
Format: Preprint
Published: 2026
Subjects: Artificial Intelligence
Online Access: https://arxiv.org/abs/2601.13559
author Hui, Sun
Yanfeng, Ding
Ma, Huidong
Xu, Chang
Jin, Keyan
Zu, Lizheng
Zhong, Cheng
Liu, Xiaoguang
Wang, Gang
Cai, Wentong
contents Lossless compression has made significant advances in Genomics Data (GD) storage, sharing, and management. Current learning-based methods are non-evolvable and suffer from low-level compression modeling, limited adaptability, and user-unfriendly interfaces. To this end, we propose AgentGC, the first evolutionary agent-based GD compressor, consisting of 3 layers with multiple agents named Leader and Worker. Specifically, 1) the User layer provides a user-friendly interface via the Leader combined with an LLM; 2) the Cognitive layer, driven by the Leader, integrates an LLM to jointly optimize algorithm, dataset, and system, addressing the issues of low-level modeling and limited adaptability; and 3) the Compression layer, headed by the Worker, performs compression and decompression via an automated multi-knowledge learning-based compression framework. On top of AgentGC, we design 3 modes to support diverse scenarios: CP for compression-ratio priority, TP for throughput priority, and BM for balanced mode. Compared with 14 baselines on 9 datasets, the average compression-ratio gains are 16.66%, 16.11%, and 16.33%, and the throughput gains are 4.73x, 9.23x, and 9.15x, respectively.
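The three modes named in the abstract (CP, TP, BM) can be pictured as different trade-offs between compression ratio and throughput. The sketch below is only an illustration of that idea, not AgentGC's actual selection logic, which the abstract does not describe. The weights, the `Candidate` type, and the `pick_candidate` function are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    CP = "compression-ratio priority"
    TP = "throughput priority"
    BM = "balanced mode"


# Hypothetical (ratio_weight, throughput_weight) pairs; not taken from the paper.
MODE_WEIGHTS = {
    Mode.CP: (1.0, 0.0),
    Mode.TP: (0.0, 1.0),
    Mode.BM: (0.5, 0.5),
}


@dataclass(frozen=True)
class Candidate:
    """A hypothetical compressor configuration with measured performance."""
    name: str
    compression_ratio: float  # original size / compressed size; higher is better
    throughput_mb_s: float    # processed megabytes per second; higher is better


def pick_candidate(candidates, mode):
    """Return the candidate with the highest mode-weighted, normalized score."""
    if not candidates:
        raise ValueError("no candidates to choose from")
    # Normalize each metric by the best observed value so the two are comparable.
    max_ratio = max(c.compression_ratio for c in candidates)
    max_tp = max(c.throughput_mb_s for c in candidates)
    w_ratio, w_tp = MODE_WEIGHTS[mode]

    def score(c):
        return (w_ratio * c.compression_ratio / max_ratio
                + w_tp * c.throughput_mb_s / max_tp)

    return max(candidates, key=score)


if __name__ == "__main__":
    # Made-up numbers purely for demonstration.
    options = [
        Candidate("slow-dense", 5.0, 1.0),
        Candidate("fast-light", 4.0, 10.0),
        Candidate("middle", 4.9, 9.0),
    ]
    for mode in Mode:
        print(mode.name, "->", pick_candidate(options, mode).name)
```

With these made-up numbers, CP favors the configuration with the highest ratio, TP the fastest one, and BM the one that is close to best on both metrics.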
format Preprint
id arxiv_https___arxiv_org_abs_2601_13559
institution arXiv
publishDate 2026
record_format arxiv
title AgentGC: Evolutionary Learning-based Lossless Compression for Genomics Data with LLM-driven Multiple Agent
topic Artificial Intelligence
url https://arxiv.org/abs/2601.13559