Staff View: Tracking the Behavioral Trajectories of Adapting Agents :: Library Catalog

Bibliographic Details
Main Authors:	Leshin, Jonah; Shah, Manish; Timmis, Ian
Format:	Preprint
Published:	2026
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2606.02536

_version_	1866917556455997440
author	Leshin, Jonah; Shah, Manish; Timmis, Ian
author_facet	Leshin, Jonah; Shah, Manish; Timmis, Ian
contents	Text files such as skill files, memory files, and behavioral configuration files play a central role in defining how modern agents act. Through edits by humans or the agents themselves, these files may evolve over time, directly steering the agent's behavior in future interactions. We present a methodology and framework for measuring agent $traits$ by defining traits as directions in the embedding space of a text embedding model. We train a linear model on labeled "before" versus "after" skill file diffs to learn a trait vector, then score arbitrary skill edits by projecting their embedding diffs onto this vector. Evaluated on 68 labeled skill diff pairs for the trait of propensity to seek sensitive data, our method achieves 91.2% sign classification accuracy and a Spearman rank correlation of $\rho = 0.82$ under leave-one-out cross-validation. We build this trait evaluation into a broader agent-to-agent protocol that enables one agent to evaluate another's skill file updates through a trusted intermediary.
format	Preprint
id	arxiv_https___arxiv_org_abs_2606_02536
institution	arXiv
publishDate	2026
record_format	arxiv
spellingShingle	Tracking the Behavioral Trajectories of Adapting Agents Leshin, Jonah Shah, Manish Timmis, Ian Artificial Intelligence Text files such as skill files, memory files, and behavioral configuration files play a central role in defining how modern agents act. Through edits by humans or the agents themselves, these files may evolve over time, directly steering the agent's behavior in future interactions. We present a methodology and framework for measuring agent $traits$ by defining traits as directions in the embedding space of a text embedding model. We train a linear model on labeled "before" versus "after" skill file diffs to learn a trait vector, then score arbitrary skill edits by projecting their embedding diffs onto this vector. Evaluated on 68 labeled skill diff pairs for the trait of propensity to seek sensitive data, our method achieves 91.2% sign classification accuracy and a Spearman rank correlation of $\rho = 0.82$ under leave-one-out cross-validation. We build this trait evaluation into a broader agent-to-agent protocol that enables one agent to evaluate another's skill file updates through a trusted intermediary.
title	Tracking the Behavioral Trajectories of Adapting Agents
topic	Artificial Intelligence
url	https://arxiv.org/abs/2606.02536
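
Note: the abstract in the contents field describes learning a trait vector from embedding diffs and scoring an edit by projecting its diff onto that vector. The sketch below illustrates that idea only; it is not the authors' implementation. It assumes ridge regression as the linear model, graded numeric labels, and random vectors standing in for the output of a real text embedding model. All function names and parameters (fit_trait_vector, score_edit, lam, and so on) are hypothetical.

```python
import numpy as np

def fit_trait_vector(diffs, labels, lam=1.0):
    """Assumed linear model: ridge regression, w = (X^T X + lam*I)^-1 X^T y."""
    d = diffs.shape[1]
    return np.linalg.solve(diffs.T @ diffs + lam * np.eye(d), diffs.T @ labels)

def score_edit(before_emb, after_emb, trait):
    """Score an edit by projecting its embedding diff onto the trait vector."""
    return float((after_emb - before_emb) @ trait)

def spearman(a, b):
    """Spearman rho as Pearson correlation of ranks (assumes no tied values)."""
    ra = np.argsort(np.argsort(a)).astype(float)
    rb = np.argsort(np.argsort(b)).astype(float)
    return float(np.corrcoef(ra, rb)[0, 1])

def leave_one_out(diffs, labels, lam=1.0):
    """Refit on all pairs but one, predict the held-out pair, repeat."""
    n = len(labels)
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        w = fit_trait_vector(diffs[keep], labels[keep], lam)
        preds[i] = diffs[i] @ w
    sign_acc = float(np.mean(np.sign(preds) == np.sign(labels)))
    return sign_acc, spearman(preds, labels)

# Synthetic stand-in data (assumption): 68 pairs of 16-dimensional "embeddings".
rng = np.random.default_rng(0)
n_pairs, dim = 68, 16
true_dir = rng.normal(size=dim)
true_dir /= np.linalg.norm(true_dir)
before = rng.normal(size=(n_pairs, dim))
after = before + rng.normal(size=(n_pairs, dim))
diffs = after - before
# Labels follow a hidden direction plus a little noise.
labels = diffs @ true_dir + 0.05 * rng.normal(size=n_pairs)

trait = fit_trait_vector(diffs, labels)
sign_acc, rho = leave_one_out(diffs, labels)
print(f"sign accuracy = {sign_acc:.3f}, Spearman rho = {rho:.3f}")
```

The figures printed here come from the synthetic data and say nothing about the results reported in the record. With real data, before and after would be the embeddings of the two versions of a skill file.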

Similar Items