:: Library Catalog

Bibliographic Details
Main Authors:	Wang, Weixuan; Yang, Jingyuan; Peng, Wei
Format:	Preprint
Published:	2024
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2410.12299

Similar Items

LF-Steering: Latent Feature Activation Steering for Enhancing Semantic Consistency in Large Language Models
by: Yang, Jingyuan, et al.
Published: (2025)

Beyond Steering Vector: Flow-based Activation Steering for Inference-Time Intervention
by: Jin, Zehao, et al.
Published: (2026)

ExpertSteer: Intervening in LLMs through Expert Knowledge
by: Wang, Weixuan, et al.
Published: (2025)

FairSteer: Inference Time Debiasing for LLMs with Dynamic Activation Steering
by: Li, Yichen, et al.
Published: (2025)

Steering MoE LLMs via Expert (De)Activation
by: Fayyaz, Mohsen, et al.
Published: (2025)

Prompt-Activation Duality: Improving Activation Steering via Attention-Level Interventions
by: Kang, Diancheng, et al.
Published: (2026)

Steer2Adapt: Dynamically Composing Steering Vectors Elicits Efficient Adaptation of LLMs
by: Han, Pengrui, et al.
Published: (2026)

Sharing Matters: Analysing Neurons Across Languages and Tasks in LLMs
by: Wang, Weixuan, et al.
Published: (2024)

Mitigating Memorization in LLMs using Activation Steering
by: Suri, Manan, et al.
Published: (2025)

SALSA: Speech Aware LLM Adaptation via Learned Steering Activation Vectors
by: Yegorova, Yekaterina, et al.
Published: (2026)

Steer2Edit: From Activation Steering to Component-Level Editing
by: Sun, Chung-En, et al.
Published: (2026)

Towards Reliable Evaluation of Behavior Steering Interventions in LLMs
by: Pres, Itamar, et al.
Published: (2024)

Sparse Brains are Also Adaptive Brains: Cognitive-Load-Aware Dynamic Activation for LLMs
by: Yang, Yiheng, et al.
Published: (2025)

Bridging the Language Gaps in Large Language Models with Inference-Time Cross-Lingual Intervention
by: Wang, Weixuan, et al.
Published: (2024)

Understanding How CodeLLMs (Mis)Predict Types with Activation Steering
by: Lucchetti, Francesca, et al.
Published: (2024)

Mechanistic Knobs in LLMs: Retrieving and Steering High-Order Semantic Features via Sparse Autoencoders
by: Zhang, Ruikang, et al.
Published: (2026)

Shifting Perspectives: Steering Vectors for Robust Bias Mitigation in LLMs
by: Siddique, Zara, et al.
Published: (2025)

Letting Tutor Personas "Speak Up" for LLMs: Learning Steering Vectors from Dialogue via Preference Optimization
by: Lee, Jaewook, et al.
Published: (2026)

Guiding Giants: Lightweight Controllers for Weighted Activation Steering in LLMs
by: Hegazy, Amr, et al.
Published: (2025)

Causal Interventions on Continuous Variables: A Case Study on Verb Bias in Steering Vectors for In-Context Learning
by: Zhou, Zhenghao Herbert, et al.
Published: (2026)

Extracting Unlearned Information from LLMs with Activation Steering
by: Seyitoğlu, Atakan, et al.
Published: (2024)

Enhancing Semantic Consistency of Large Language Models through Model Editing: An Interpretability-Oriented Approach
by: Yang, Jingyuan, et al.
Published: (2025)

Focus On This, Not That! Steering LLMs with Adaptive Feature Specification
by: Lamb, Tom A., et al.
Published: (2024)

BILLY: Steering Large Language Models via Merging Persona Vectors for Creative Generation
by: Pai, Tsung-Min, et al.
Published: (2025)

Fine-Grained Activation Steering: Steering Less, Achieving More
by: Feng, Zijian, et al.
Published: (2026)

Personalized Text Generation with Contrastive Activation Steering
by: Zhang, Jinghao, et al.
Published: (2025)

Activation Steering via Generative Causal Mediation
by: Sankaranarayanan, Aruna, et al.
Published: (2026)

Activation-Space Personality Steering: Hybrid Layer Selection for Stable Trait Control in LLMs
by: Bhandari, Pranav, et al.
Published: (2025)

RepIt: Steering Language Models with Concept-Specific Refusal Vectors
by: Siu, Vincent, et al.
Published: (2025)

Steering Awareness: Detecting Activation Steering from Within
by: Rivera, Joshua Fonseca, et al.
Published: (2025)

Faithful Bi-Directional Model Steering via Distribution Matching and Distributed Interchange Interventions
by: Bao, Yuntai, et al.
Published: (2026)

Infinite Retrieval: Attention Enhanced LLMs in Long-Context Processing
by: Ye, Xiaoju, et al.
Published: (2025)

VSPO: Vector-Steered Policy Optimization for Behavioral Control
by: Zhang, Xuechen, et al.
Published: (2026)

Analysing the Safety Pitfalls of Steering Vectors
by: Li, Yuxiao, et al.
Published: (2026)

Predicting Where Steering Vectors Succeed
by: Billa, Jayadev
Published: (2026)

HyperSteer: Activation Steering at Scale with Hypernetworks
by: Sun, Jiuding, et al.
Published: (2025)

Steering LLMs for Culturally Localized Generation
by: Khanuja, Simran, et al.
Published: (2026)

Activated Parameter Locating via Causal Intervention for Model Merging
by: Kong, Fanshuang, et al.
Published: (2024)

Conceptors for Semantic Steering
by: Triantafyllopoulos, Ilias, et al.
Published: (2026)

CogSteer: Cognition-Inspired Selective Layer Intervention for Efficiently Steering Large Language Models
by: Wang, Xintong, et al.
Published: (2024)