Bibliographic Details
Main authors: Liu, Weiqi, Cao, Fenglei, Qi, Yuan, Xu, Li-Cheng
Format: Preprint
Published: 2026
Subjects:
Online access: https://arxiv.org/abs/2601.03689
author Liu, Weiqi
Cao, Fenglei
Qi, Yuan
Xu, Li-Cheng
contents With the rise of data-driven reaction prediction models, effective reaction descriptors are crucial for bridging the gap between real-world chemistry and digital representations. However, general-purpose, reaction-wise descriptors remain scarce. This study introduces RXNEmb, a novel reaction-level descriptor derived from RXNGraphormer, a model pre-trained to distinguish real reactions from fictitious ones with erroneous bond changes, thereby learning intrinsic bond formation and cleavage patterns. We demonstrate its utility by data-driven re-clustering of the USPTO-50k dataset, yielding a classification that more directly reflects bond-change similarities than rule-based categories. Combined with dimensionality reduction, RXNEmb enables visualization of reaction space diversity. Furthermore, attention weight analysis reveals the model's focus on chemically critical sites, providing mechanistic insight. RXNEmb serves as a powerful, interpretable tool for reaction fingerprinting and analysis, paving the way for more data-centric approaches in reaction analysis and discovery.
format Preprint
id arxiv_https___arxiv_org_abs_2601_03689
institution arXiv
publishDate 2026
record_format arxiv
title A Pre-trained Reaction Embedding Descriptor Capturing Bond Transformation Patterns
topic Machine Learning
Artificial Intelligence
Chemical Physics
url https://arxiv.org/abs/2601.03689