Internformat: :: Library Catalog

Gespeichert in:

Bibliographische Detailangaben
Hauptverfasser:	Suzuki, Teppei, Ri, Ryokan, Takase, Sho
Format:	Preprint
Veröffentlicht:	2025
Schlagworte:	Computation and Language
Online-Zugang:	https://arxiv.org/abs/2504.14871
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

_version_	1866914046650875904
author	Suzuki, Teppei Ri, Ryokan Takase, Sho
author_facet	Suzuki, Teppei Ri, Ryokan Takase, Sho
contents	Recent studies have shown that the outputs from large language models (LLMs) can often reveal the identity of their source model. While this is a natural consequence of LLMs modeling the distribution of their training data, such identifiable traces may also reflect unintended characteristics with potential implications for fairness and misuse. In this work, we go one step further and show that even when LLMs are trained on exactly the same dataset, their outputs remain distinguishable, suggesting that training dynamics alone can leave recognizable patterns. We refer to these unintended, distinctive characteristics as natural fingerprints. By systematically controlling training conditions, we show that the natural fingerprints can emerge from subtle differences in the training process, such as parameter sizes, optimization settings, and even random seeds. These results suggest that training dynamics can systematically shape model behavior, independent of data or architecture, and should be explicitly considered in future research on transparency, reliability, and interpretability.
format	Preprint
id	arxiv_https___arxiv_org_abs_2504_14871
institution	arXiv
publishDate	2025
record_format	arxiv
spellingShingle	Natural Fingerprints of Large Language Models Suzuki, Teppei Ri, Ryokan Takase, Sho Computation and Language Recent studies have shown that the outputs from large language models (LLMs) can often reveal the identity of their source model. While this is a natural consequence of LLMs modeling the distribution of their training data, such identifiable traces may also reflect unintended characteristics with potential implications for fairness and misuse. In this work, we go one step further and show that even when LLMs are trained on exactly the same dataset, their outputs remain distinguishable, suggesting that training dynamics alone can leave recognizable patterns. We refer to these unintended, distinctive characteristics as natural fingerprints. By systematically controlling training conditions, we show that the natural fingerprints can emerge from subtle differences in the training process, such as parameter sizes, optimization settings, and even random seeds. These results suggest that training dynamics can systematically shape model behavior, independent of data or architecture, and should be explicitly considered in future research on transparency, reliability, and interpretability.
title	Natural Fingerprints of Large Language Models
topic	Computation and Language
url	https://arxiv.org/abs/2504.14871

Ähnliche Einträge