| Main Authors: | Wang, Weixuan, Yang, Jingyuan, Peng, Wei |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2410.12299 |
Similar Items
LF-Steering: Latent Feature Activation Steering for Enhancing Semantic Consistency in Large Language Models
by: Yang, Jingyuan, et al.
Published: (2025)
Beyond Steering Vector: Flow-based Activation Steering for Inference-Time Intervention
by: Jin, Zehao, et al.
Published: (2026)
ExpertSteer: Intervening in LLMs through Expert Knowledge
by: Wang, Weixuan, et al.
Published: (2025)
FairSteer: Inference Time Debiasing for LLMs with Dynamic Activation Steering
by: Li, Yichen, et al.
Published: (2025)
Steering MoE LLMs via Expert (De)Activation
by: Fayyaz, Mohsen, et al.
Published: (2025)
Prompt-Activation Duality: Improving Activation Steering via Attention-Level Interventions
by: Kang, Diancheng, et al.
Published: (2026)
Steer2Adapt: Dynamically Composing Steering Vectors Elicits Efficient Adaptation of LLMs
by: Han, Pengrui, et al.
Published: (2026)
Sharing Matters: Analysing Neurons Across Languages and Tasks in LLMs
by: Wang, Weixuan, et al.
Published: (2024)
Mitigating Memorization in LLMs using Activation Steering
by: Suri, Manan, et al.
Published: (2025)
SALSA: Speech Aware LLM Adaptation via Learned Steering Activation Vectors
by: Yegorova, Yekaterina, et al.
Published: (2026)
Steer2Edit: From Activation Steering to Component-Level Editing
by: Sun, Chung-En, et al.
Published: (2026)
Towards Reliable Evaluation of Behavior Steering Interventions in LLMs
by: Pres, Itamar, et al.
Published: (2024)
Sparse Brains are Also Adaptive Brains: Cognitive-Load-Aware Dynamic Activation for LLMs
by: Yang, Yiheng, et al.
Published: (2025)
Bridging the Language Gaps in Large Language Models with Inference-Time Cross-Lingual Intervention
by: Wang, Weixuan, et al.
Published: (2024)
Understanding How CodeLLMs (Mis)Predict Types with Activation Steering
by: Lucchetti, Francesca, et al.
Published: (2024)
Mechanistic Knobs in LLMs: Retrieving and Steering High-Order Semantic Features via Sparse Autoencoders
by: Zhang, Ruikang, et al.
Published: (2026)
Shifting Perspectives: Steering Vectors for Robust Bias Mitigation in LLMs
by: Siddique, Zara, et al.
Published: (2025)
Letting Tutor Personas "Speak Up" for LLMs: Learning Steering Vectors from Dialogue via Preference Optimization
by: Lee, Jaewook, et al.
Published: (2026)
Guiding Giants: Lightweight Controllers for Weighted Activation Steering in LLMs
by: Hegazy, Amr, et al.
Published: (2025)
Causal Interventions on Continuous Variables: A Case Study on Verb Bias in Steering Vectors for In-Context Learning
by: Zhou, Zhenghao Herbert, et al.
Published: (2026)
Extracting Unlearned Information from LLMs with Activation Steering
by: Seyitoğlu, Atakan, et al.
Published: (2024)
Enhancing Semantic Consistency of Large Language Models through Model Editing: An Interpretability-Oriented Approach
by: Yang, Jingyuan, et al.
Published: (2025)
Focus On This, Not That! Steering LLMs with Adaptive Feature Specification
by: Lamb, Tom A., et al.
Published: (2024)
BILLY: Steering Large Language Models via Merging Persona Vectors for Creative Generation
by: Pai, Tsung-Min, et al.
Published: (2025)
Fine-Grained Activation Steering: Steering Less, Achieving More
by: Feng, Zijian, et al.
Published: (2026)
Personalized Text Generation with Contrastive Activation Steering
by: Zhang, Jinghao, et al.
Published: (2025)
Activation Steering via Generative Causal Mediation
by: Sankaranarayanan, Aruna, et al.
Published: (2026)
Activation-Space Personality Steering: Hybrid Layer Selection for Stable Trait Control in LLMs
by: Bhandari, Pranav, et al.
Published: (2025)
RepIt: Steering Language Models with Concept-Specific Refusal Vectors
by: Siu, Vincent, et al.
Published: (2025)
Steering Awareness: Detecting Activation Steering from Within
by: Rivera, Joshua Fonseca, et al.
Published: (2025)
Faithful Bi-Directional Model Steering via Distribution Matching and Distributed Interchange Interventions
by: Bao, Yuntai, et al.
Published: (2026)
Infinite Retrieval: Attention Enhanced LLMs in Long-Context Processing
by: Ye, Xiaoju, et al.
Published: (2025)
VSPO: Vector-Steered Policy Optimization for Behavioral Control
by: Zhang, Xuechen, et al.
Published: (2026)
Analysing the Safety Pitfalls of Steering Vectors
by: Li, Yuxiao, et al.
Published: (2026)
Predicting Where Steering Vectors Succeed
by: Billa, Jayadev
Published: (2026)
HyperSteer: Activation Steering at Scale with Hypernetworks
by: Sun, Jiuding, et al.
Published: (2025)
Steering LLMs for Culturally Localized Generation
by: Khanuja, Simran, et al.
Published: (2026)
Activated Parameter Locating via Causal Intervention for Model Merging
by: Kong, Fanshuang, et al.
Published: (2024)
Conceptors for Semantic Steering
by: Triantafyllopoulos, Ilias, et al.
Published: (2026)
CogSteer: Cognition-Inspired Selective Layer Intervention for Efficiently Steering Large Language Models
by: Wang, Xintong, et al.
Published: (2024)