| Main Authors: | Hu, Yi-Xiang, Wang, Yuke, Wu, Feng, Huang, Zirui, Zeng, Shuli, Li, Xiang-Yang |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.06064 |
Similar Items
FATE: Future-State-Aware Scheduling for Heterogeneous LLM Workflows
by: Huang, Zirui, et al.
Published: (2026)
SLO-Aware Scheduling for Large Language Model Inferences
by: Huang, Jinqi, et al.
Published: (2025)
Resource Allocation and Workload Scheduling for Large-Scale Distributed Deep Learning: A Survey
by: Liang, Feng, et al.
Published: (2024)
Scale: Deep Reinforcement Learning for Container Scheduling in Serverless Edge Computing
by: Chen, Chen, et al.
Published: (2026)
Reinforcement Learning Optimization for Large-Scale Learning: An Efficient and User-Friendly Scaling Library
by: Wang, Weixun, et al.
Published: (2025)
PATCHEDSERVE: A Patch Management Framework for SLO-Optimized Hybrid Resolution Diffusion Serving
by: Sun, Desen, et al.
Published: (2025)
Characterization-Guided GPU Fault Resilience in NVIDIA MPS
by: Liu, Rixin, et al.
Published: (2026)
A Reinforcement Learning-Driven Task Scheduling Algorithm for Multi-Tenant Distributed Systems
by: Zhang, Xiaopei, et al.
Published: (2025)
HydraInfer: Hybrid Disaggregated Scheduling for Multimodal Large Language Model Serving
by: Dong, Xianzhe, et al.
Published: (2025)
MSARS: A Meta-Learning and Reinforcement Learning Framework for SLO Resource Allocation and Adaptive Scaling for Microservices
by: Hu, Kan, et al.
Published: (2024)
MRSch: Multi-Resource Scheduling for HPC
by: Li, Boyang, et al.
Published: (2024)
FLAME: A Serving System Optimized for Large-Scale Generative Recommendation with Efficiency
by: Guo, Xianwen, et al.
Published: (2025)
Deep Reinforcement Learning-based Methods for Resource Scheduling in Cloud Computing: A Review and Future Directions
by: Zhou, Guangyao, et al.
Published: (2021)
iDDS: Intelligent Distributed Dispatch and Scheduling for Workflow Orchestration
by: Guan, Wen, et al.
Published: (2025)
Scheduling Data-Intensive Workloads in Large-Scale Distributed Systems: Trends and Challenges
by: Stavrinides, Georgios L., et al.
Published: (2025)
Workflow-Driven Modeling for the Compute Continuum: An Optimization Approach to Automated System and Workload Scheduling
by: Sharma, Aasish Kumar, et al.
Published: (2025)
A Deep Reinforcement Learning Approach for Cost Optimized Workflow Scheduling in Cloud Computing Environments
by: Jayanetti, Amanda, et al.
Published: (2024)
λScale: Enabling Fast Scaling for Serverless Large Language Model Inference
by: Yu, Minchen, et al.
Published: (2025)
LRScheduler: A Layer-aware and Resource-adaptive Container Scheduler in Edge Computing
by: Tang, Zhiqing, et al.
Published: (2025)
Reinforcement Learning for Adaptive Resource Scheduling in Complex System Environments
by: Li, Pochun, et al.
Published: (2024)
Scalable HPC Job Scheduling and Resource Management in SST
by: Abdurahman, Abubeker, et al.
Published: (2025)
A HPC Co-Scheduler with Reinforcement Learning
by: Souza, Abel, et al.
Published: (2024)
Collaborative Multi-Agent Reinforcement Learning Approach for Elastic Cloud Resource Scaling
by: Fang, Bruce, et al.
Published: (2025)
Dynamic Scheduling Strategies for Resource Optimization in Computing Environments
by: Wang, Xiaoye
Published: (2024)
An Efficient and Adaptive Watermark Detection System with Tile-based Error Correction
by: Zhong, Xinrui, et al.
Published: (2025)
Cortex: Workflow-Aware Resource Pooling and Scheduling for Agentic Serving
by: Pagonas, Nikos, et al.
Published: (2025)
Arrow: Adaptive Scheduling Mechanisms for Disaggregated LLM Inference Architecture
by: Wu, Yu, et al.
Published: (2025)
A Survey on Model-heterogeneous Federated Learning: Problems, Methods, and Prospects
by: Fan, Boyu, et al.
Published: (2023)
Diagonal Scaling: A Multi-Dimensional Resource Model and Optimization Framework for Distributed Databases
by: Abdullah, Shahir, et al.
Published: (2025)
RingAda: Pipelining Large Model Fine-Tuning on Edge Devices with Scheduled Layer Unfreezing
by: Li, Liang, et al.
Published: (2025)
FREESH: Fair, Resource- and Energy-Efficient Scheduling for LLM Serving on Heterogeneous GPUs
by: He, Xuan, et al.
Published: (2025)
INSPIRIT: Optimizing Heterogeneous Task Scheduling through Adaptive Priority in Task-based Runtime Systems
by: Wang, Yiqing, et al.
Published: (2024)
cuSZ-$i$: High-Ratio Scientific Lossy Compression on GPUs with Optimized Multi-Level Interpolation
by: Liu, Jinyang, et al.
Published: (2023)
Enhancing Kubernetes Automated Scheduling with Deep Learning and Reinforcement Techniques for Large-Scale Cloud Computing Optimization
by: Xu, Zheng, et al.
Published: (2024)
SLICE: SLO-Driven Scheduling for LLM Inference on Edge Computing Devices
by: Chow, Will
Published: (2025)
Dynamic Service Scheduling and Resource Management in Energy-Harvesting Multi-access Edge Computing
by: Chen, Shuyi, et al.
Published: (2025)
Will LLMs Scaling Hit the Wall? Breaking Barriers via Distributed Resources on Massive Edge Devices
by: Shen, Tao, et al.
Published: (2025)
Data-Driven Analysis to Understand GPU Hardware Resource Usage of Optimizations
by: Islam, Tanzima Z., et al.
Published: (2024)
Optimal Fixed Priority Scheduling in Multi-Stage Multi-Resource Distributed Real-Time Systems
by: Kumar, Niraj, et al.
Published: (2024)
Scheduling Deep Learning Jobs in Multi-Tenant GPU Clusters via Wise Resource Sharing
by: Luo, Yizhou, et al.
Published: (2024)