Qi, X., Chen, M., Xiao, W., Ye, J., He, Y., Li, C., & Lin, Z. (2025). DNT: A Deeply Normalized Transformer that can be trained by Momentum SGD.
Chicago citation style (17th ed.): Qi, Xianbiao, Marco Chen, Wenjie Xiao, Jiaquan Ye, Yelin He, Chun-Guang Li, and Zhouchen Lin. DNT: A Deeply Normalized Transformer That Can Be Trained by Momentum SGD. 2025.
MLA citation style (9th ed.): Qi, Xianbiao, et al. DNT: A Deeply Normalized Transformer That Can Be Trained by Momentum SGD. 2025.
Warning: these citations may not be 100% accurate.