APA (7th ed.) Citation

Gao, B., He, Z., Sharma, P., Kang, Q., Jevdjic, D., Deng, J., Yang, X., Yu, Z., & Zuo, P. (2024). Cost-efficient large language model serving for multi-turn conversations with CachedAttention.

Chicago Style (17th ed.) Citation

Gao, Bin, Zhuomin He, Puru Sharma, Qingxuan Kang, Djordje Jevdjic, Junbo Deng, Xingkun Yang, Zhou Yu, and Pengfei Zuo. Cost-Efficient Large Language Model Serving for Multi-turn Conversations with CachedAttention. 2024.

MLA (9th ed.) Citation

Gao, Bin, et al. Cost-Efficient Large Language Model Serving for Multi-turn Conversations with CachedAttention. 2024.

Warning: These citations may not always be 100% accurate.