APA Citation: Levine, R., Sharma, R., Jain, N., Ramesh, A., Chen, Z., Abbas, N., . . . Sorensen, T. (2026). Llamas on the Web: Memory-Efficient, Performance-Portable, and Multi-Precision LLM Inference with WebGPU.
Chicago Style (17th ed.) Citation: Levine, Reese, Rithik Sharma, Nikhil Jain, Abhijit Ramesh, Zheyuan Chen, Neha Abbas, James Contini, and Tyler Sorensen. Llamas on the Web: Memory-Efficient, Performance-Portable, and Multi-Precision LLM Inference with WebGPU. 2026.
MLA (9th ed.) Citation: Levine, Reese, et al. Llamas on the Web: Memory-Efficient, Performance-Portable, and Multi-Precision LLM Inference with WebGPU. 2026.