Li, W., Jiang, G., Ding, X., Tao, Z., Hao, C., Xu, C., . . . Wang, H. (2025). FlowKV: A Disaggregated Inference Framework with Low-Latency KV Cache Transfer and Load-Aware Scheduling.
Chicago Style (17th ed.) Citation: Li, Weiqing, Guochao Jiang, Xiangyong Ding, Zhangcheng Tao, Chuzhan Hao, Chenfeng Xu, Yuewei Zhang, and Hao Wang. FlowKV: A Disaggregated Inference Framework with Low-Latency KV Cache Transfer and Load-Aware Scheduling. 2025.
MLA (9th ed.) Citation: Li, Weiqing, et al. FlowKV: A Disaggregated Inference Framework with Low-Latency KV Cache Transfer and Load-Aware Scheduling. 2025.