| Main Authors: | Schmitt, Matthew S; Lee, Kiseok; Bunbury, Freddy; Landsittel, Joseph A; Vitelli, Vincenzo; Kuehn, Seppe |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | Biological Physics; Genomics |
| Online Access: | https://arxiv.org/abs/2603.03547 |
| _version_ | 1866912941750616064 |
|---|---|
| author | Schmitt, Matthew S; Lee, Kiseok; Bunbury, Freddy; Landsittel, Joseph A; Vitelli, Vincenzo; Kuehn, Seppe |
| author_facet | Schmitt, Matthew S; Lee, Kiseok; Bunbury, Freddy; Landsittel, Joseph A; Vitelli, Vincenzo; Kuehn, Seppe |
| contents | From soil to the gut, communities composed of thousands of microbes perform functions such as carbon sequestration and immune system regulation. Here, we introduce a data-driven approach that explains how community function can be traced to just a few groups of microbes or genes. In gut communities, our neural-network based clustering algorithm correctly recovers known functional groups. In the ocean metagenome, it distills ~500 gene modules down to three sparse groups highlighting survival strategies at different depths. In soils, it distills ~4400 bacterial species into two groups that enter a mathematical model of nitrate metabolism. By combining interpretable ML with strain isolation and sequencing experiments, we connect the metabolic specialization of each group to community-wide responses to perturbations. This integrated approach yields simple structure-function maps of microbiomes, allowing the discovery of molecular mechanisms underlying human and environmental health. More broadly, we illustrate how to do function-informed dimensionality reduction in biology. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2603_03547 |
| institution | arXiv |
| publishDate | 2026 |
| record_format | arxiv |
| title | Learning functional groups in complex microbiomes |
| topic | Biological Physics; Genomics |
| url | https://arxiv.org/abs/2603.03547 |