Bibliographic Details
Main author: Nikitin, Filipp
Format: Digital resource
Language:
Published: Zenodo 2026
Online access: https://doi.org/10.5281/zenodo.18943205
Summary:
<h2>Release Notes</h2>
<h3>Highlights</h3>
<ul>
  <li>Added a <strong>Streamlit web app</strong> for interactive conformer generation, postprocessing, and visualization (<code>app/README.md</code>, <code>app/app.py</code>).</li>
  <li>Added <strong>flow-matching model support</strong> (<code>data/loqi_flow.ckpt</code>) alongside diffusion (<code>data/loqi.ckpt</code>).</li>
  <li>Expanded <strong>benchmarking workflows</strong> for speed and quality comparisons across methods.</li>
</ul>
<h3>Data Releases</h3>
<ul>
  <li><p><strong>Full ChEMBL3D conformer dataset</strong><br> https://kilthub.cmu.edu/articles/dataset/_b_ChEMBL3D_Quantum-Accurate_3D_Conformers_for_ChEMBL_at_Scale_b_/31428449<br> DOI: https://doi.org/10.1184/R1/31428449</p></li>
  <li><p><strong>Processed dataset + LoQI checkpoints (diffusion + flow matching)</strong><br> https://kilthub.cmu.edu/articles/dataset/LoQI_Scalable_Low-Energy_Molecular_Conformer_Generation_with_Quantum_Mechanical_Accuracy/31441570<br> DOI: https://doi.org/10.1184/R1/31441570</p></li>
</ul>
<h3>Sampling and Inference Updates</h3>
<ul>
  <li>Updated <code>scripts/sample_conformers.py</code> with:
    <ul>
      <li>stricter input validation (including unsupported element/radical checks),</li>
      <li>optional hydrogen handling for SMILES (<code>--add-hs</code> / <code>--no-add-hs</code>),</li>
      <li>atom-aware dynamic batching (<code>--atom-aware-batching</code>, <code>--target-molecule-size</code>, <code>--shuffle</code>),</li>
      <li>postprocessing modes (<code>none</code>, <code>optimization</code>, <code>optimization+irmsd</code>),</li>
      <li>direct reuse of existing 3D coordinates from SDF inputs.</li>
    </ul>
  </li>
</ul>
<h3>Benchmarking and Evaluation</h3>
<ul>
  <li>Added/updated scripts for:
    <ul>
      <li>fixed-size runtime benchmarking (<code>scripts/performance_test.py</code>),</li>
      <li>model-specific evaluation workflows (RDKit / CONFORGE / MOLTIVERSE pipelines),</li>
      <li>optimization/timing exports for reproducible comparisons.</li>
    </ul>
  </li>
</ul>
<h3>Model and Dataset Scope</h3>
<ul>
  <li>Supported AIMNet2 elements:<br> <code>H, C, N, O, F, S, Cl, Br, I, B, Si, P, As, Se</code>.</li>
  <li>Reported full-pipeline curation statistics:
    <ul>
      <li>Initial conformers: ~505M</li>
      <li>Processed conformers: ~280M</li>
      <li>Unique structures: ~2.4M (~1.8M unique ChEMBL IDs)</li>
      <li>Conformers with relative energy &gt; 6 kcal/mol were removed.</li>
    </ul>
  </li>
</ul>
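The unsupported-element check mentioned under input validation can be illustrated with a minimal sketch. The element set is copied from the release notes; the function names `unsupported_elements` and `is_supported` are hypothetical and are not taken from `scripts/sample_conformers.py`, whose actual validation logic (including the radical check) is not shown here.

```python
# Element set copied from the "Supported AIMNet2 elements" entry in the release notes.
SUPPORTED_ELEMENTS = frozenset(
    ["H", "C", "N", "O", "F", "S", "Cl", "Br", "I", "B", "Si", "P", "As", "Se"]
)


def unsupported_elements(symbols):
    """Return a sorted list of element symbols that fall outside the supported set.

    `symbols` is any iterable of element symbols, e.g. one entry per atom.
    (Hypothetical helper, for illustration only.)
    """
    return sorted(set(symbols) - SUPPORTED_ELEMENTS)


def is_supported(symbols):
    """True when every element symbol in the molecule is supported."""
    return not unsupported_elements(symbols)


if __name__ == "__main__":
    # Sodium chloride fragment mixed into an organic molecule: "Na" is rejected.
    print(unsupported_elements(["C", "H", "H", "Na", "Cl"]))
    print(is_supported(["C", "H", "O", "N"]))
```

In practice the symbols would come from a parsed molecule rather than a hand-written list; the sketch only shows the set comparison.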
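The idea behind atom-aware dynamic batching can be sketched as follows. The release notes list the flags `--atom-aware-batching` and `--target-molecule-size` but do not describe their semantics, so this example rests on an assumption: each batch is capped by a total atom budget rather than a fixed molecule count. The function `atom_aware_batches` and its parameter `max_atoms_per_batch` are hypothetical names, not part of the released script.

```python
def atom_aware_batches(atom_counts, max_atoms_per_batch):
    """Yield lists of molecule indices whose summed atom counts fit the budget.

    Assumption for illustration: a batch is closed as soon as adding the next
    molecule would exceed `max_atoms_per_batch`. A single molecule larger than
    the budget still gets a batch of its own instead of being dropped.
    """
    batch, total = [], 0
    for idx, n_atoms in enumerate(atom_counts):
        if batch and total + n_atoms > max_atoms_per_batch:
            yield batch
            batch, total = [], 0
        batch.append(idx)
        total += n_atoms
    if batch:
        yield batch


if __name__ == "__main__":
    sizes = [10, 20, 30, 50, 5]  # atoms per molecule
    for indices in atom_aware_batches(sizes, max_atoms_per_batch=60):
        print(indices, sum(sizes[i] for i in indices))
```

Compared with fixed-size batches, a budget of this kind keeps memory use per batch roughly constant when molecule sizes vary widely, which is the usual motivation for batching by atom count.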
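The 6 kcal/mol relative-energy cutoff from the curation statistics can be shown with a short worked example. The release notes do not state the reference energy, so this sketch assumes that relative energy is measured against the lowest-energy conformer of the same molecule and that all energies are already in kcal/mol. The function `filter_by_relative_energy` is a hypothetical name.

```python
def filter_by_relative_energy(energies_kcal, cutoff_kcal=6.0):
    """Return indices of conformers within `cutoff_kcal` of the lowest energy.

    Assumption for illustration: `energies_kcal` holds the energies of all
    conformers of one molecule, in kcal/mol, and the reference is their minimum.
    Conformers with relative energy strictly greater than the cutoff are removed.
    """
    if not energies_kcal:
        return []
    e_min = min(energies_kcal)
    return [i for i, e in enumerate(energies_kcal) if e - e_min <= cutoff_kcal]


if __name__ == "__main__":
    # Relative energies are 0.0, 2.5, 7.0 and 6.0 kcal/mol; only the third exceeds 6.
    energies = [-100.0, -97.5, -93.0, -94.0]
    print(filter_by_relative_energy(energies))
```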