APA (7th ed.) Citation

Wang, Z., Cui, B., & Gan, S. (2024). SqueezeAttention: 2D management of KV-cache in LLM inference via layer-wise optimal budget.

Chicago Style (17th ed.) Citation

Wang, Zihao, Bin Cui, and Shaoduo Gan. SqueezeAttention: 2D Management of KV-Cache in LLM Inference via Layer-wise Optimal Budget. 2024.

MLA (9th ed.) Citation

Wang, Zihao, et al. SqueezeAttention: 2D Management of KV-Cache in LLM Inference via Layer-wise Optimal Budget. 2024.
