| Main Authors: | Kang, Minjae, Kim, Jaehyung |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2603.06745 |
Similar Items
Improving Instruction-Following in Language Models through Activation Steering
by: Stolfo, Alessandro, et al.
Published: (2024)
Few-shot Personalization of LLMs with Mis-aligned Responses
by: Kim, Jaehyung, et al.
Published: (2024)
Fast and Fluent Diffusion Language Models via Convolutional Decoding and Rejective Fine-tuning
by: Seo, Yeongbin, et al.
Published: (2025)
Learning to Correct for QA Reasoning with Black-box LLMs
by: Kim, Jaehyung, et al.
Published: (2024)
Dynamically Scaled Activation Steering
by: Ferrando, Alex, et al.
Published: (2025)
Optimized Feature Generation for Tabular Data via LLMs with Decision Tree Reasoning
by: Nam, Jaehyun, et al.
Published: (2024)
Debiasing Online Preference Learning via Preference Feature Preservation
by: Kim, Dongyoung, et al.
Published: (2025)
Training-free LLM Verification via Recycling Few-shot Examples
by: Lee, Dongseok, et al.
Published: (2025)
Extracting Unlearned Information from LLMs with Activation Steering
by: Seyitoğlu, Atakan, et al.
Published: (2024)
Global Evolutionary Steering: Refining Activation Steering Control via Cross-Layer Consistency
by: Jiang, Xinyan, et al.
Published: (2026)
Gap-K%: Measuring Top-1 Prediction Gap for Detecting Pretraining Data
by: Kwak, Minseo, et al.
Published: (2026)
CBMAS: Cognitive Behavioral Modeling via Activation Steering
by: Ismail, Ahmed H., et al.
Published: (2026)
Structural Reasoning Improves Molecular Understanding of LLM
by: Jang, Yunhui, et al.
Published: (2024)
Learning from the Undesirable: Robust Adaptation of Language Models without Forgetting
by: Nam, Yunhun, et al.
Published: (2025)
Self-Evolving LLMs via Continual Instruction Tuning
by: Kang, Jiazheng, et al.
Published: (2025)
Steering LLMs via Scalable Interactive Oversight
by: Zhou, Enyu, et al.
Published: (2026)
Personalized LLM Decoding via Contrasting Personal Preference
by: Bu, Hyungjune, et al.
Published: (2025)
Is In-Context Learning Sufficient for Instruction Following in LLMs?
by: Zhao, Hao, et al.
Published: (2024)
SAIF: A Sparse Autoencoder Framework for Interpreting and Steering Instruction Following of Language Models
by: He, Zirui, et al.
Published: (2025)
Hierarchical Meta-Reinforcement Learning via Automated Macro-Action Discovery
by: Cho, Minjae, et al.
Published: (2024)
Angular Steering: Behavior Control via Rotation in Activation Space
by: Vu, Hieu M., et al.
Published: (2025)
Tabular Transfer Learning via Prompting LLMs
by: Nam, Jaehyun, et al.
Published: (2024)
HyperSteer: Activation Steering at Scale with Hypernetworks
by: Sun, Jiuding, et al.
Published: (2025)
Minimizing Collateral Damage in Activation Steering
by: Nguyen, Tam, et al.
Published: (2026)
Steered LLM Activations are Non-Surjective
by: Mishra, Aayush, et al.
Published: (2026)
Activation Steering for Chain-of-Thought Compression
by: Azizi, Seyedarmin, et al.
Published: (2025)
The Amazing Agent Race: Strong Tool Users, Weak Navigators
by: Kim, Zae Myung, et al.
Published: (2026)
Enhancing and Assessing Instruction-Following with Fine-Grained Instruction Variants
by: Yang, Jiuding, et al.
Published: (2024)
Hierarchical Context Merging: Better Long Context Understanding for Pre-trained LLMs
by: Song, Woomin, et al.
Published: (2024)
Steer Like the LLM: Activation Steering that Mimics Prompting
by: Heyman, Geert, et al.
Published: (2026)
Steer2Adapt: Dynamically Composing Steering Vectors Elicits Efficient Adaptation of LLMs
by: Han, Pengrui, et al.
Published: (2026)
Steering Llama 2 via Contrastive Activation Addition
by: Panickssery, Nina, et al.
Published: (2023)
Riemannian Optimization for LoRA on the Stiefel Manifold
by: Park, Juneyoung, et al.
Published: (2025)
Online Continual Learning For Interactive Instruction Following Agents
by: Kim, Byeonghwi, et al.
Published: (2024)
The Road Less Traveled: Enhancing Exploration in LLMs via Sequential Sampling
by: Kang, Shijia, et al.
Published: (2025)
Light-IF: Endowing LLMs with Generalizable Reasoning via Preview and Self-Checking for Complex Instruction Following
by: Wang, Chenyang, et al.
Published: (2025)
Multi-property Steering of Large Language Models with Dynamic Activation Composition
by: Scalena, Daniel, et al.
Published: (2024)
Belief Dynamics Reveal the Dual Nature of In-Context Learning and Activation Steering
by: Bigelow, Eric, et al.
Published: (2025)
From Language Modeling to Instruction Following: Understanding the Behavior Shift in LLMs after Instruction Tuning
by: Wu, Xuansheng, et al.
Published: (2023)
Local Linearity of LLMs Enables Activation Steering via Model-Based Linear Optimal Control
by: Skifstad, Julian, et al.
Published: (2026)