:: Library Catalog

Bibliographic Details
Main Authors:	Liu, Junming; Zhang, Yusen; Zhang, Rongchao; Zhu, Wenkai; Wu, Tian
Format:	Preprint
Published:	2026
Subjects:	Distributed, Parallel, and Cluster Computing; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2603.05957

Similar Items

Conflict-Free Replicated Data Types for Neural Network Model Merging: A Two-Layer Architecture Enabling CRDT-Compliant Model Merging Across 26 Strategies
by: Gillespie, Ryan
Published: (2026)

MergePipe: A Budget-Aware Parameter Management System for Scalable LLM Merging
by: Wang, Yuanyi, et al.
Published: (2026)

Nightjar: Dynamic Adaptive Speculative Decoding for Large Language Models Serving
by: Li, Rui, et al.
Published: (2025)

Adaptive Fault Tolerance Mechanisms of Large Language Models in Cloud Computing Environments
by: Jin, Yihong, et al.
Published: (2025)

A-IO: Adaptive Inference Orchestration for Memory-Bound NPUs
by: Zhang, Chen, et al.
Published: (2026)

Data-Juicer 2.0: Cloud-Scale Adaptive Data Processing for and with Foundation Models
by: Chen, Daoyuan, et al.
Published: (2024)

Profiling-Driven Adaptive Distributed Transformer Inference on Embedded Edge Deployment
by: Qazi, Muhammad Azlan, et al.
Published: (2026)

ExpertFlow: Adaptive Expert Scheduling and Memory Coordination for Efficient MoE Inference
by: Shen, Zixu, et al.
Published: (2025)

Trade-offs in Decentralized Agentic AI Discovery Across the Compute Continuum
by: Dazzi, Patrizio, et al.
Published: (2026)

PacTrain: Pruning and Adaptive Sparse Gradient Compression for Efficient Collective Communication in Distributed Deep Learning
by: Wang, Yisu, et al.
Published: (2025)

Accelerating Long-Tail Generation in Synchronous RLHF Training via Adaptive Tensor Parallelism
by: Zhao, Long, et al.
Published: (2026)

AdaPtis: Reducing Pipeline Bubbles with Adaptive Pipeline Parallelism on Heterogeneous Models
by: Guo, Jihu, et al.
Published: (2025)

ScaleAcross Explorer: Exploring Communication Optimization for Scale-Across AI Model Training
by: Li, Minghao, et al.
Published: (2026)

OrchMLLM: Orchestrate Multimodal Data with Batch Post-Balancing to Accelerate Multimodal Large Language Model Training
by: Zheng, Yijie, et al.
Published: (2025)

Deploying Foundation Model Powered Agent Services: A Survey
by: Xu, Wenchao, et al.
Published: (2024)

AIBrix: Towards Scalable, Cost-Effective Large Language Model Inference Infrastructure
by: The AIBrix Team, et al.
Published: (2025)

SkyServe: Serving AI Models across Regions and Clouds with Spot Instances
by: Mao, Ziming, et al.
Published: (2024)

Multi-IaC-Eval: Benchmarking Cloud Infrastructure as Code Across Multiple Formats
by: Davidson, Sam, et al.
Published: (2025)

SGDPO: Self-Guided Direct Preference Optimization for Language Model Alignment
by: Zhu, Wenqiao, et al.
Published: (2025)

Dynamic Resource Allocation for Virtual Machine Migration Optimization using Machine Learning
by: Gong, Yulu, et al.
Published: (2024)

Electricity Cost Minimization for Multi-Workflow Allocation in Geo-Distributed Data Centers
by: Wang, Shuang, et al.
Published: (2025)

Research on Model Parallelism and Data Parallelism Optimization Methods in Large Language Model-Based Recommendation Systems
by: Yang, Haowei, et al.
Published: (2025)

Equinox: Holistic Fair Scheduling in Serving Large Language Models
by: Wei, Zhixiang, et al.
Published: (2025)

CCL-D: A High-Precision Diagnostic System for Slow and Hang Anomalies in Large-Scale Model Training
by: Gu, Yida, et al.
Published: (2026)

KAITIAN: A Unified Communication Framework for Enabling Efficient Collaboration Across Heterogeneous Accelerators in Embodied AI Systems
by: Lin, Jieke, et al.
Published: (2025)

PlanetServe: A Decentralized, Scalable, and Privacy-Preserving Overlay for Democratizing Large Language Model Serving
by: Fang, Fei, et al.
Published: (2025)

SparOA: Sparse and Operator-aware Hybrid Scheduling for Edge DNN Inference
by: Zhang, Ziyang, et al.
Published: (2025)

NeurLZ: An Online Neural Learning-Based Method to Enhance Scientific Lossy Compression
by: Jia, Wenqi, et al.
Published: (2024)

WORKSWORLD: A Domain for Integrated Numeric Planning and Scheduling of Distributed Pipelined Workflows
by: Paul, Taylor, et al.
Published: (2026)

Mosaic: Data-Free Knowledge Distillation via Mixture-of-Experts for Heterogeneous Distributed Environments
by: Liu, Junming, et al.
Published: (2025)

Mind the Boundary: Stabilizing Gemini Enterprise A2A via a Cloud Run Hub Across Projects and Accounts
by: Morita, Takao
Published: (2026)

TimelyFreeze: Adaptive Parameter Freezing Mechanism for Pipeline Parallelism
by: Cho, Seonghye, et al.
Published: (2026)

An Upload-Efficient Scheme for Transferring Knowledge From a Server-Side Pre-trained Generator to Clients in Heterogeneous Federated Learning
by: Zhang, Jianqing, et al.
Published: (2024)

Adaptive AI-based Decentralized Resource Management in the Cloud-Edge Continuum
by: Li, Lanpei, et al.
Published: (2025)

Para-B&B: Load-Balanced Deterministic Parallelization of Solving MIP
by: Zhang, Jinyu, et al.
Published: (2026)

Verify Distributed Deep Learning Model Implementation Refinement with Iterative Relation Inference
by: Wang, Zhanghan, et al.
Published: (2025)

PRAGMA: A Profiling-Reasoned Multi-Agent Framework for Automatic Kernel Optimization
by: Lei, Kelun, et al.
Published: (2025)

Deep Reinforcement Learning for Fault-Adaptive Routing in Eisenstein-Jacobi Interconnection Topologies
by: Charrwi, Mohammad Walid, et al.
Published: (2026)

StreamServe: Adaptive Speculative Flows for Low-Latency Disaggregated LLM Serving
by: Kumar, Satyam, et al.
Published: (2026)

Enhancing Communication Efficiency in FL with Adaptive Gradient Quantization and Communication Frequency Optimization
by: Tariq, Asadullah, et al.
Published: (2025)