Kong, C., Jang, J., & Kwak, N. (2025). Understanding Differential Transformer Unchains Pretrained Self-Attentions.
Chicago Style (17th ed.) Citation: Kong, Chaerin, Jiho Jang, and Nojun Kwak. Understanding Differential Transformer Unchains Pretrained Self-Attentions. 2025.
MLA (9th ed.) Citation: Kong, Chaerin, et al. Understanding Differential Transformer Unchains Pretrained Self-Attentions. 2025.