Han, P., Qian, C., Chen, X., Zhang, Y., Ji, H., & Zhang, D. (2025). SafeSwitch: Steering Unsafe LLM Behavior via Internal Activation Signals.
Chicago Style (17th ed.) Citation: Han, Peixuan, Cheng Qian, Xiusi Chen, Yuji Zhang, Heng Ji, and Denghui Zhang. SafeSwitch: Steering Unsafe LLM Behavior via Internal Activation Signals. 2025.
MLA (9th ed.) Citation: Han, Peixuan, et al. SafeSwitch: Steering Unsafe LLM Behavior via Internal Activation Signals. 2025.