Staff View: Tracing Pharmacological Knowledge In Large Language Models :: Library Catalog

Bibliographic Details
Main Authors:	Khwaja, Basil Hasan; Chen, Dylan; Toor, Guntas; Kuznetsova, Anastasiya
Format:	Preprint
Published:	2026
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2603.03407

_version_	1866914367669272576
author	Khwaja, Basil Hasan; Chen, Dylan; Toor, Guntas; Kuznetsova, Anastasiya
author_facet	Khwaja, Basil Hasan; Chen, Dylan; Toor, Guntas; Kuznetsova, Anastasiya
contents	Large language models (LLMs) have shown strong empirical performance across pharmacology and drug discovery tasks, yet the internal mechanisms by which they encode pharmacological knowledge remain poorly understood. In this work, we investigate how drug-group semantics are represented and retrieved within Llama-based biomedical language models using causal and probing-based interpretability methods. We apply activation patching to localize where drug-group information is stored across model layers and token positions, and complement this analysis with linear probes trained on token-level and sum-pooled activations. Our results demonstrate that early layers play a key role in encoding drug-group knowledge, with the strongest causal effects arising from intermediate tokens within the drug-group span rather than the final drug-group token. Linear probing further reveals that pharmacological semantics are distributed across tokens and are already present in the embedding space, with token-level probes performing near chance while sum-pooled representations achieve maximal accuracy. Together, these findings suggest that drug-group semantics in LLMs are not localized to single tokens but instead arise from distributed representations. This study provides the first systematic mechanistic analysis of pharmacological knowledge in LLMs, offering insights into how biomedical semantics are encoded in large language models.
format	Preprint
id	arxiv_https___arxiv_org_abs_2603_03407
institution	arXiv
publishDate	2026
record_format	arxiv
title	Tracing Pharmacological Knowledge In Large Language Models
topic	Computation and Language
url	https://arxiv.org/abs/2603.03407
