| Main Authors: | Young, Adamo; Wang, Fei; Wishart, David S; Wang, Bo; Greiner, Russell; Röst, Hannes |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | Machine Learning; Biomolecules |
| Online Access: | https://arxiv.org/abs/2404.02360 |
| _version_ | 1866914007267409920 |
|---|---|
| author | Young, Adamo; Wang, Fei; Wishart, David S; Wang, Bo; Greiner, Russell; Röst, Hannes |
| author_facet | Young, Adamo; Wang, Fei; Wishart, David S; Wang, Bo; Greiner, Russell; Röst, Hannes |
| contents | Compound identification from tandem mass spectrometry (MS/MS) data is a critical step in the analysis of complex mixtures. Typical solutions for the MS/MS spectrum to compound (MS2C) problem involve comparing the unknown spectrum against a library of known spectrum-molecule pairs, an approach that is limited by incomplete library coverage. Compound to MS/MS spectrum (C2MS) models can improve retrieval rates by augmenting real libraries with predicted MS/MS spectra. Unfortunately, many existing C2MS models suffer from problems with mass accuracy, generalization, or interpretability. We develop a new probabilistic method for C2MS prediction, FraGNNet, that can efficiently and accurately simulate MS/MS spectra with high mass accuracy. Our approach formulates the C2MS problem as learning a distribution over molecule fragments. FraGNNet achieves state-of-the-art performance in terms of prediction error and surpasses existing C2MS models as a tool for retrieval-based MS2C. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2404_02360 |
| institution | arXiv |
| publishDate | 2024 |
| record_format | arxiv |
| title | FraGNNet: A Deep Probabilistic Model for Tandem Mass Spectrum Prediction |
| topic | Machine Learning; Biomolecules |
| url | https://arxiv.org/abs/2404.02360 |