Li, Q., Zhang, B., Ye, L., Zhang, Y., Wu, W., Sun, Y., . . . Xie, Y. (2024). Flash Communication: Reducing Tensor Parallelization Bottleneck for Fast Large Language Model Inference.
Chicago Style (17th ed.) Citation: Li, Qingyuan, Bo Zhang, Liang Ye, Yifan Zhang, Wei Wu, Yerui Sun, Lin Ma, and Yuchen Xie. Flash Communication: Reducing Tensor Parallelization Bottleneck for Fast Large Language Model Inference. 2024.
MLA (9th ed.) Citation: Li, Qingyuan, et al. Flash Communication: Reducing Tensor Parallelization Bottleneck for Fast Large Language Model Inference. 2024.