| Main Authors: | Liu, Junming, Zhang, Yusen, Zhang, Rongchao, Zhu, Wenkai, Wu, Tian |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2603.05957 |
Similar Items
Conflict-Free Replicated Data Types for Neural Network Model Merging: A Two-Layer Architecture Enabling CRDT-Compliant Model Merging Across 26 Strategies
by: Gillespie, Ryan
Published: (2026)
MergePipe: A Budget-Aware Parameter Management System for Scalable LLM Merging
by: Wang, Yuanyi, et al.
Published: (2026)
Nightjar: Dynamic Adaptive Speculative Decoding for Large Language Models Serving
by: Li, Rui, et al.
Published: (2025)
Adaptive Fault Tolerance Mechanisms of Large Language Models in Cloud Computing Environments
by: Jin, Yihong, et al.
Published: (2025)
A-IO: Adaptive Inference Orchestration for Memory-Bound NPUs
by: Zhang, Chen, et al.
Published: (2026)
Data-Juicer 2.0: Cloud-Scale Adaptive Data Processing for and with Foundation Models
by: Chen, Daoyuan, et al.
Published: (2024)
Profiling-Driven Adaptive Distributed Transformer Inference on Embedded Edge Deployment
by: Qazi, Muhammad Azlan, et al.
Published: (2026)
ExpertFlow: Adaptive Expert Scheduling and Memory Coordination for Efficient MoE Inference
by: Shen, Zixu, et al.
Published: (2025)
Trade-offs in Decentralized Agentic AI Discovery Across the Compute Continuum
by: Dazzi, Patrizio, et al.
Published: (2026)
PacTrain: Pruning and Adaptive Sparse Gradient Compression for Efficient Collective Communication in Distributed Deep Learning
by: Wang, Yisu, et al.
Published: (2025)
Accelerating Long-Tail Generation in Synchronous RLHF Training via Adaptive Tensor Parallelism
by: Zhao, Long, et al.
Published: (2026)
AdaPtis: Reducing Pipeline Bubbles with Adaptive Pipeline Parallelism on Heterogeneous Models
by: Guo, Jihu, et al.
Published: (2025)
ScaleAcross Explorer: Exploring Communication Optimization for Scale-Across AI Model Training
by: Li, Minghao, et al.
Published: (2026)
OrchMLLM: Orchestrate Multimodal Data with Batch Post-Balancing to Accelerate Multimodal Large Language Model Training
by: Zheng, Yijie, et al.
Published: (2025)
Deploying Foundation Model Powered Agent Services: A Survey
by: Xu, Wenchao, et al.
Published: (2024)
AIBrix: Towards Scalable, Cost-Effective Large Language Model Inference Infrastructure
by: The AIBrix Team, et al.
Published: (2025)
SkyServe: Serving AI Models across Regions and Clouds with Spot Instances
by: Mao, Ziming, et al.
Published: (2024)
Multi-IaC-Eval: Benchmarking Cloud Infrastructure as Code Across Multiple Formats
by: Davidson, Sam, et al.
Published: (2025)
SGDPO: Self-Guided Direct Preference Optimization for Language Model Alignment
by: Zhu, Wenqiao, et al.
Published: (2025)
Dynamic Resource Allocation for Virtual Machine Migration Optimization using Machine Learning
by: Gong, Yulu, et al.
Published: (2024)
Electricity Cost Minimization for Multi-Workflow Allocation in Geo-Distributed Data Centers
by: Wang, Shuang, et al.
Published: (2025)
Research on Model Parallelism and Data Parallelism Optimization Methods in Large Language Model-Based Recommendation Systems
by: Yang, Haowei, et al.
Published: (2025)
Equinox: Holistic Fair Scheduling in Serving Large Language Models
by: Wei, Zhixiang, et al.
Published: (2025)
CCL-D: A High-Precision Diagnostic System for Slow and Hang Anomalies in Large-Scale Model Training
by: Gu, Yida, et al.
Published: (2026)
KAITIAN: A Unified Communication Framework for Enabling Efficient Collaboration Across Heterogeneous Accelerators in Embodied AI Systems
by: Lin, Jieke, et al.
Published: (2025)
PlanetServe: A Decentralized, Scalable, and Privacy-Preserving Overlay for Democratizing Large Language Model Serving
by: Fang, Fei, et al.
Published: (2025)
SparOA: Sparse and Operator-aware Hybrid Scheduling for Edge DNN Inference
by: Zhang, Ziyang, et al.
Published: (2025)
NeurLZ: An Online Neural Learning-Based Method to Enhance Scientific Lossy Compression
by: Jia, Wenqi, et al.
Published: (2024)
WORKSWORLD: A Domain for Integrated Numeric Planning and Scheduling of Distributed Pipelined Workflows
by: Paul, Taylor, et al.
Published: (2026)
Mosaic: Data-Free Knowledge Distillation via Mixture-of-Experts for Heterogeneous Distributed Environments
by: Liu, Junming, et al.
Published: (2025)
Mind the Boundary: Stabilizing Gemini Enterprise A2A via a Cloud Run Hub Across Projects and Accounts
by: Morita, Takao
Published: (2026)
TimelyFreeze: Adaptive Parameter Freezing Mechanism for Pipeline Parallelism
by: Cho, Seonghye, et al.
Published: (2026)
An Upload-Efficient Scheme for Transferring Knowledge From a Server-Side Pre-trained Generator to Clients in Heterogeneous Federated Learning
by: Zhang, Jianqing, et al.
Published: (2024)
Adaptive AI-based Decentralized Resource Management in the Cloud-Edge Continuum
by: Li, Lanpei, et al.
Published: (2025)
Para-B&B: Load-Balanced Deterministic Parallelization of Solving MIP
by: Zhang, Jinyu, et al.
Published: (2026)
Verify Distributed Deep Learning Model Implementation Refinement with Iterative Relation Inference
by: Wang, Zhanghan, et al.
Published: (2025)
PRAGMA: A Profiling-Reasoned Multi-Agent Framework for Automatic Kernel Optimization
by: Lei, Kelun, et al.
Published: (2025)
Deep Reinforcement Learning for Fault-Adaptive Routing in Eisenstein-Jacobi Interconnection Topologies
by: Charrwi, Mohammad Walid, et al.
Published: (2026)
StreamServe: Adaptive Speculative Flows for Low-Latency Disaggregated LLM Serving
by: Kumar, Satyam, et al.
Published: (2026)
Enhancing Communication Efficiency in FL with Adaptive Gradient Quantization and Communication Frequency Optimization
by: Tariq, Asadullah, et al.
Published: (2025)