| Main Authors: | Chung, Jae-Won, Liang, Zhirui, Mao, Yanyong, Chen, Jiasi, Chowdhury, Mosharaf, Dvorkin, Vladimir |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2605.05519 |
Similar Items
Toward Cross-Layer Energy Optimizations in AI Systems
by: Chung, Jae-Won, et al.
Published: (2024)
Kareus: Joint Reduction of Dynamic and Static Energy in Large Model Training
by: Wu, Ruofan, et al.
Published: (2026)
Where Do the Joules Go? Diagnosing Inference Energy Consumption
by: Chung, Jae-Won, et al.
Published: (2026)
Andes: Defining and Enhancing Quality-of-Experience in LLM-Based Text Streaming Services
by: Liu, Jiachen, et al.
Published: (2024)
Reducing Energy Bloat in Large Model Training
by: Chung, Jae-Won, et al.
Published: (2023)
Addressing Variable Heterogeneity in Distributed Multimodal Training with Entrain
by: Jang, Insu, et al.
Published: (2026)
Cornserve: A Distributed Serving System for Any-to-Any Multimodal Models
by: Chung, Jae-Won, et al.
Published: (2026)
Cornfigurator: Automated Planning for Any-to-Any Multimodal Model Serving
by: Ma, Jeff J., et al.
Published: (2025)
Coordinated Cooling and Compute Management for AI Datacenters
by: Abera, Nardos Belay, et al.
Published: (2026)
KAIROS: Stateful, Context-Aware Power-Efficient Agentic Inference Serving
by: Yuan, Yichao, et al.
Published: (2026)
Efficient Distributed MLLM Training with Cornstarch
by: Jang, Insu, et al.
Published: (2025)
OpenDC-STEAM: Realistic Modeling and Systematic Exploration of Composable Techniques for Sustainable Datacenters
by: Niewenhuis, Dante, et al.
Published: (2026)
OpenDT: Exploring Datacenter Performance and Sustainability with a Self-Calibrating Digital Twin
by: Nicolae, Radu, et al.
Published: (2026)
Tetris: Efficient Intra-Datacenter Calls Packing for Large Conferencing Services
by: Gandhi, Rohan, et al.
Published: (2025)
QAOA in Quantum Datacenters: Parallelization, Simulation, and Orchestration
by: Liaqat, Amana, et al.
Published: (2025)
Datacenter Energy Optimized Power Profiles
by: Narayanaswamy, Sreedhar, et al.
Published: (2025)
Venn: Resource Management for Collaborative Learning Jobs
by: Liu, Jiachen, et al.
Published: (2023)
Serving Compound Inference Systems on Datacenter GPUs
by: Devata, Sriram, et al.
Published: (2026)
M3SA: Exploring Datacenter Performance and Climate-Impact with Multi- and Meta-Model Simulation and Analysis
by: Nicolae, Radu, et al.
Published: (2026)
An AI-Native Runtime for Multi-Wearable Environments
by: Min, Chulhong, et al.
Published: (2024)
Designing Datacenter Power Delivery Hierarchies for the AI Era
by: Wilkins, Grant, et al.
Published: (2026)
FedTrans: Efficient Federated Learning via Multi-Model Transformation
by: Zhu, Yuxuan, et al.
Published: (2024)
Capsule: Efficient Player Isolation for Datacenters
by: Du, Zhouheng, et al.
Published: (2025)
GICC: A High-Performance Runtime for GPU-Initiated Communication and Coordination in Modern HPC Systems
by: Shan, Baodi, et al.
Published: (2026)
CrossPipe: Towards Optimal Pipeline Schedules for Cross-Datacenter Training
by: Chen, Tiancheng, et al.
Published: (2025)
Uncertainty-Aware Decarbonization for Datacenters
by: Li, Amy, et al.
Published: (2024)
6G EdgeAI: Performance Evaluation and Analysis
by: Yang, Chien-Sheng, et al.
Published: (2025)
6G Infrastructures for Edge AI: An Analytical Perspective
by: Horvath, Kurt, et al.
Published: (2025)
The Ghost in the Datacenter: Link Flapping, Topology Knowledge Failures, and the FITO Category Mistake
by: Borrill, Paul
Published: (2026)
PowerTrip: Exploiting Federated Heterogeneous Datacenter Power for Distributed ML Training
by: Mehboob, Talha, et al.
Published: (2025)
Power Stabilization for AI Training Datacenters
by: Choukse, Esha, et al.
Published: (2025)
Modeling the Impact of Fiber Latency on Compute-Communication Overlap in Geo-Distributed Multi-Datacenter AI Training
by: Papavasileiou, Ioannis, et al.
Published: (2026)
Prefill-as-a-Service: KVCache of Next-Generation Models Could Go Cross-Datacenter
by: Qin, Ruoyu, et al.
Published: (2026)
DCGen 1.1 Technical Report: Generating Datacenter Configurations (including IT, Power, Cooling)
by: Gnibga, Wedan Emmanuel, et al.
Published: (2026)
Integrating and Characterizing HPC Task Runtime Systems for hybrid AI-HPC workloads
by: Merzky, Andre, et al.
Published: (2025)
Distribution and Management of Datacenter Load Decoupling
by: Lin, Liuzixuan, et al.
Published: (2025)
Mercury: QoS-Aware Tiered Memory System
by: Lu, Jiaheng, et al.
Published: (2024)
Coordinating GPU Data Centers and Power Grid Regulation Service for Exogenous Carbon Benefits
by: Jahanshahi, Ali, et al.
Published: (2026)
Hydra: Virtualized Multi-Language Runtime for High-Density Serverless Platforms
by: Ivanenko, Serhii, et al.
Published: (2022)
Characterization of Large Language Model Development in the Datacenter
by: Hu, Qinghao, et al.
Published: (2024)