Bibliographic Details
Main Authors: Drechsel, Jonathan; Herbold, Steffen
Format: Preprint
Published: 2026
Online Access: https://arxiv.org/abs/2602.23993
Table of Contents:
  • We present gradiend, an open-source Python package that operationalizes the GRADIEND method for learning feature directions in language models from factual-counterfactual gradients, computed under masked language modeling (MLM) and causal language modeling (CLM) objectives. The package provides a unified workflow for feature-related data creation, training, evaluation, visualization, persistent model rewriting via controlled weight updates, and multi-feature comparison. We demonstrate GRADIEND on an English pronoun paradigm and on a large-scale feature comparison that reproduces prior use cases.