Zhao, J., Huang, J., Wu, Z., Bau, D., & Shi, W. (2025). LLMs Encode Harmfulness and Refusal Separately.
Chicago Style (17th ed.) Citation: Zhao, Jiachen, Jing Huang, Zhengxuan Wu, David Bau, and Weiyan Shi. LLMs Encode Harmfulness and Refusal Separately. 2025.
MLA (9th ed.) Citation: Zhao, Jiachen, et al. LLMs Encode Harmfulness and Refusal Separately. 2025.