Wen, Q., Huang, Z., Meng, X., He, W., & Li, C. (2026). Mixture-of-Top-k Attention: Efficient Attention via Scalable Fast Weights.
Chicago Style (17th ed.) Citation: Wen, Qishuai, Zhiyuan Huang, Xianghan Meng, Wei He, and Chun-Guang Li. Mixture-of-Top-k Attention: Efficient Attention via Scalable Fast Weights. 2026.
MLA (9th ed.) Citation: Wen, Qishuai, et al. Mixture-of-Top-k Attention: Efficient Attention via Scalable Fast Weights. 2026.