Wang, J., Yao, Y., Kuang, W., Mao, R., Sun, Z., Tao, Z., . . . Zhang, K. (2025). OmniInfer: System-Wide Acceleration Techniques for Optimizing LLM Serving Throughput and Latency.
Chicago Style (17th ed.) Citation: Wang, Jun, et al. OmniInfer: System-Wide Acceleration Techniques for Optimizing LLM Serving Throughput and Latency. 2025.
MLA (9th ed.) Citation: Wang, Jun, et al. OmniInfer: System-Wide Acceleration Techniques for Optimizing LLM Serving Throughput and Latency. 2025.