Staff View: T-Rex: Text-assisted Retrosynthesis Prediction :: Library Catalog

Bibliographic Details
Main Authors:	Liu, Yifeng; Xu, Hanwen; Fang, Tangqi; Xi, Haocheng; Liu, Zixuan; Zhang, Sheng; Poon, Hoifung; Wang, Sheng
Format:	Preprint
Published:	2024
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2401.14637

_version_	1866914653488021504
author	Liu, Yifeng; Xu, Hanwen; Fang, Tangqi; Xi, Haocheng; Liu, Zixuan; Zhang, Sheng; Poon, Hoifung; Wang, Sheng
author_facet	Liu, Yifeng; Xu, Hanwen; Fang, Tangqi; Xi, Haocheng; Liu, Zixuan; Zhang, Sheng; Poon, Hoifung; Wang, Sheng
contents	As a fundamental task in computational chemistry, retrosynthesis prediction aims to identify a set of reactants that can synthesize a target molecule. Existing template-free approaches consider only the graph structure of the target molecule and often generalize poorly to rare reaction types and large molecules. Here, we propose T-Rex, a text-assisted retrosynthesis prediction approach that exploits pre-trained text language models, such as ChatGPT, to assist the generation of reactants. T-Rex first uses ChatGPT to generate a description of the target molecule and ranks candidate reaction centers based on both the description and the molecular graph. It then re-ranks these candidates by querying descriptions for each group of reactants and examining which group can best synthesize the target molecule. We observed that T-Rex substantially outperformed graph-based state-of-the-art approaches on two datasets, indicating the effectiveness of considering text information. We further found that T-Rex outperformed the variant that uses only the ChatGPT-based description without the re-ranking step, demonstrating that our framework outperformed a straightforward integration of ChatGPT and graph information. Collectively, we show that text generated by pre-trained language models can substantially improve retrosynthesis prediction, opening up new avenues for exploiting ChatGPT to advance computational chemistry. The code is available at https://github.com/lauyikfung/T-Rex.
format	Preprint
id	arxiv_https___arxiv_org_abs_2401_14637
institution	arXiv
publishDate	2024
record_format	arxiv
title	T-Rex: Text-assisted Retrosynthesis Prediction
topic	Computation and Language
url	https://arxiv.org/abs/2401.14637

Similar Items