| Main Authors: | Shahane, Aditya Hemant; Sirohi, Anuj Kumar; Arora, Devansh; Kumar, Nitin; P, Prathosh A; Kumar, Sandeep |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | Computation and Language |
| Online Access: | https://arxiv.org/abs/2604.24089 |
| _version_ | 1866913064583954432 |
|---|---|
| author | Shahane, Aditya Hemant; Sirohi, Anuj Kumar; Arora, Devansh; Kumar, Nitin; P, Prathosh A; Kumar, Sandeep |
| contents | Bridging molecular structures and natural language is essential for controllable design. Autoregressive models struggle with long-range dependencies, while standard diffusion processes apply uniform corruption across positions, which can distort structurally informative tokens. We present BiMol-Diff, a unified diffusion framework for the paired tasks of text-conditioned molecule generation and molecule captioning. Our key component is a token-aware noise schedule that assigns position-dependent corruption based on token recovery difficulty, preserving harder-to-recover substructures during the forward process. On ChEBI-20 and M3-20M, BiMol-Diff improves molecule reconstruction with a 15.4% relative gain in Exact Match and achieves strong captioning results, attaining best BLEU and BERTScore among compared baselines. These results indicate token-aware noising improves fidelity in molecular structure-language modelling. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2604_24089 |
| institution | arXiv |
| publishDate | 2026 |
| record_format | arxiv |
| title | BiMol-Diff: A Unified Diffusion Framework for Molecular Generation and Captioning |
| topic | Computation and Language |
| url | https://arxiv.org/abs/2604.24089 |
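
The token-aware noise schedule mentioned in the contents field can be illustrated with a small sketch. This is not the authors' implementation: the record does not say how recovery difficulty is measured or what form of corruption is used. The sketch assumes masking-style corruption, per-token difficulty scores in [0, 1] supplied by the caller, and a hypothetical power-law schedule (`token_keep_prob`, `corrupt`, `strength`, and `MASK` are all illustrative names). The only property taken from the abstract is that harder-to-recover tokens are corrupted less at a given timestep.

```python
import random

MASK = "[MASK]"  # hypothetical mask symbol, not taken from the paper


def token_keep_prob(t, difficulty, strength=1.0):
    """Probability that a token is still uncorrupted at time t in [0, 1].

    Assumed form: 1 - t ** (1 + strength * difficulty).
    With difficulty = 0 this is the uniform linear schedule 1 - t.
    A larger difficulty raises the exponent, so the token survives longer,
    while the endpoints (clean at t = 0, fully corrupted at t = 1) are unchanged.
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    if not 0.0 <= difficulty <= 1.0:
        raise ValueError("difficulty must lie in [0, 1]")
    return 1.0 - t ** (1.0 + strength * difficulty)


def corrupt(tokens, difficulties, t, rng):
    """Mask each token independently using its own keep probability."""
    if len(tokens) != len(difficulties):
        raise ValueError("tokens and difficulties must have the same length")
    return [
        tok if rng.random() < token_keep_prob(t, d) else MASK
        for tok, d in zip(tokens, difficulties)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    smiles_tokens = ["C", "C", "(", "=", "O", ")", "O"]
    # Illustrative scores only: branch and bond tokens are marked as harder.
    scores = [0.1, 0.1, 0.9, 0.8, 0.3, 0.9, 0.3]
    print(corrupt(smiles_tokens, scores, t=0.5, rng=rng))
```

Under these assumptions, a token with difficulty 1.0 is kept with probability 0.75 at t = 0.5, against 0.5 for a token with difficulty 0.0, so the harder tokens stay visible for longer in the forward process.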