Chelba, C., Chen, M., Bapna, A., & Shazeer, N. (2020). Faster Transformer Decoding: N-gram Masked Self-Attention.
Chicago Style (17th ed.) Citation
Chelba, Ciprian, Mia Chen, Ankur Bapna, and Noam Shazeer. Faster Transformer Decoding: N-gram Masked Self-Attention. 2020.
MLA (9th ed.) Citation
Chelba, Ciprian, et al. Faster Transformer Decoding: N-gram Masked Self-Attention. 2020.
Warning: These citations may not always be 100% accurate.