| Main Authors: | Cao, Zhiwei, Li, Minghao, Lin, Feng, Jia, Jimin, Wen, Yonggang, Yin, Jianxiong, See, Simon |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2504.04982 |
Similar Items
SPPO: Efficient Long-sequence LLM Training via Adaptive Sequence Pipeline Parallel Offloading
by: Chen, Qiaoling, et al.
Published: (2025)
CoFormer: Collaborating with Heterogeneous Edge Devices for Scalable Transformer Inference
by: Xu, Guanyu, et al.
Published: (2025)
FUSCO: High-Performance Distributed Data Shuffling via Transformation-Communication Fusion
by: Zhu, Zhuoran, et al.
Published: (2025)
PromptTuner: SLO-Aware Elastic System for LLM Prompt Tuning
by: Gao, Wei, et al.
Published: (2026)
Low-Latency Layer-Aware Proactive and Passive Container Migration in Meta Computing
by: Liu, Mengjie, et al.
Published: (2024)
Adaptive Configuration Selection for Multi-Model Inference Pipelines in Edge Computing
by: Sheng, Jinhao, et al.
Published: (2025)
A Systematic Literature Review on Task Allocation and Performance Management Techniques in Cloud Data Center
by: Chauhan, Nidhika, et al.
Published: (2024)
LRScheduler: A Layer-aware and Resource-adaptive Container Scheduler in Edge Computing
by: Tang, Zhiqing, et al.
Published: (2025)
Advancing Environmental Sustainability in Data Centers via Carbon Depreciation Models
by: Ji, Shixin, et al.
Published: (2024)
Hetis: Serving LLMs in Heterogeneous GPU Clusters with Fine-grained and Dynamic Parallelism
by: Mo, Zizhao, et al.
Published: (2025)
Sustainable Grid through Distributed Data Centers: Spinning AI Demand for Grid Stabilization and Optimization
by: Evans, Scott C, et al.
Published: (2025)
Xorbits: Automating Operator Tiling for Distributed Data Science
by: Lu, Weizheng, et al.
Published: (2023)
On the Performance and Memory Footprint of Distributed Training: An Empirical Study on Transformers
by: Lu, Zhengxian, et al.
Published: (2024)
CONCUR: High-Throughput Agentic Batch Inference of LLM via Congestion-Based Concurrency Control
by: Chen, Qiaoling, et al.
Published: (2026)
Humas: A Heterogeneity- and Upgrade-aware Microservice Auto-scaling Framework in Large-scale Data Centers
by: Hua, Qin, et al.
Published: (2024)
Hyperion: Low-Latency Ultra-HD Video Analytics via Collaborative Vision Transformer Inference
by: Jiang, Linyi, et al.
Published: (2025)
InternEvo: Efficient Long-sequence Large Language Model Training via Hybrid Parallelism and Redundant Sharding
by: Chen, Qiaoling, et al.
Published: (2024)
Autonomy Loops for Monitoring, Operational Data Analytics, Feedback, and Response in HPC Operations
by: Boito, Francieli, et al.
Published: (2024)
Choosing the Right Battery Model for Data Center Simulations
by: Kilian, Paul, et al.
Published: (2025)
Eventually-Consistent Federated Scheduling for Data Center Workloads
by: Thiyyakat, Meghana, et al.
Published: (2023)
DFedSat: Communication-Efficient and Robust Decentralized Federated Learning for LEO Satellite Constellations
by: Yang, Minghao, et al.
Published: (2024)
Edge AI: A Taxonomy, Systematic Review and Future Directions
by: Gill, Sukhpal Singh, et al.
Published: (2024)
Adaptive Management of Microservices in Dynamic Computing Environments: A Taxonomy and Future Directions
by: Chen, Ming, et al.
Published: (2026)
Accelerating AIGC Services with Latent Action Diffusion Scheduling in Edge Networks
by: Xu, Changfu, et al.
Published: (2024)
DataCenterGym: A Physics-Grounded Simulator for Multi-Objective Data Center Scheduling
by: Pathak, Nilavra, et al.
Published: (2026)
A Fast Task Offloading Optimization Framework for IRS-Assisted Multi-Access Edge Computing System
by: Wu, Jianqiu, et al.
Published: (2023)
Data Management System Analysis for Distributed Computing Workloads
by: Hsu, Kuan-Chieh, et al.
Published: (2025)
DynaFlow: Transparent and Flexible Intra-Device Parallelism via Programmable Operator Scheduling
by: Pan, Yi, et al.
Published: (2026)
Buffer Centering for bittide Synchronization via Frame Rotation
by: Lall, Sanjay, et al.
Published: (2025)
Exploring the Efficiency of Renewable Energy-based Modular Data Centers at Scale
by: Sun, Jinghan, et al.
Published: (2024)
Environmentally-Conscious Cloud Orchestration Considering Geo-Distributed Data Centers
by: Attenni, Giulio, et al.
Published: (2025)
Anywhere: A Web Crawler Automation Management Interface
by: Lin, Jinwei
Published: (2024)
FlashFuser: Expanding the Scale of Kernel Fusion for Compute-Intensive Operators via Inter-Core Connection
by: Huang, Ziyu, et al.
Published: (2025)
FATE: Future-State-Aware Scheduling for Heterogeneous LLM Workflows
by: Huang, Zirui, et al.
Published: (2026)
Managing, Analyzing and Sharing Research Data with Gen3 Data Commons
by: Barnes, Craig, et al.
Published: (2025)
Semantic-Aware Scheduling for GPU Clusters with Large Language Models
by: Wang, Zerui, et al.
Published: (2025)
DCSim: Computing and Networking Integration based Container Scheduling Simulator for Data Centers
by: Hu, Jinlong, et al.
Published: (2024)
A Taxonomy of Schedulers -- Operating Systems, Clusters and Big Data Frameworks
by: Sliwko, Leszek
Published: (2025)
From Data Center IoT Telemetry to Data Analytics Chatbots -- Virtual Knowledge Graph is All You Need
by: Khan, Junaid Ahmed, et al.
Published: (2025)
Data Version Management and Machine-Actionable Reproducibility for HPC
by: Knüpfer, Andreas, et al.
Published: (2025)