Wang, Y., Zhuo, Z., Zeng, Y., Zhou, X., Yang, J., & Li, X. (2025). Scale-Distribution Decoupling: Enabling Stable and Effective Training of Large Language Models.
Chicago Style (17th ed.) Citation: Wang, Ya, Zhijian Zhuo, Yutao Zeng, Xun Zhou, Jian Yang, and Xiaoqing Li. Scale-Distribution Decoupling: Enabling Stable and Effective Training of Large Language Models. 2025.
MLA (9th ed.) Citation: Wang, Ya, et al. Scale-Distribution Decoupling: Enabling Stable and Effective Training of Large Language Models. 2025.