:: Library Catalog

Bibliographic Details
Main Authors:	Xu, Zuyu; Lv, Bin
Format:	Preprint
Published:	2025
Subjects:	Machine Learning; Distributed, Parallel, and Cluster Computing
Online Access:	https://arxiv.org/abs/2503.23289

Similar Items

Graph Neural Networks as Ordering Heuristics for Parallel Graph Coloring
by: Langedal, Kenneth, et al.
Published: (2024)

RadixMLP -- Intra-batch Deduplication for Causal Transformers
by: Feil, Michael, et al.
Published: (2026)

GSplit: Scaling Graph Neural Network Training on Large Graphs via Split-Parallelism
by: Polisetty, Sandeep, et al.
Published: (2023)

DHP: Efficient Scaling of MLLM Training with Dynamic Hybrid Parallelism
by: Niu, Yifan, et al.
Published: (2026)

Rel-HNN: Split Parallel Hypergraph Neural Network for Learning on Relational Databases
by: Alam, Md. Tanvir, et al.
Published: (2025)

DHO$_2$: Accelerating Distributed Hybrid Order Optimization via Model Parallelism and ADMM
by: Gu, Shunxian, et al.
Published: (2025)

BLoad: Enhancing Neural Network Training with Efficient Sequential Data Handling
by: Ruschel, Raphael, et al.
Published: (2023)

TASP: Topology-aware Sequence Parallelism
by: Wang, Yida, et al.
Published: (2025)

Ariel-ML: Computing Parallelization with Embedded Rust for Neural Networks on Heterogeneous Multi-core Microcontrollers
by: Huang, Zhaolan, et al.
Published: (2025)

Grappa: Gradient-Only Communication for Scalable Graph Neural Network Training
by: Xu, Chongyang, et al.
Published: (2026)

FlexSP: Accelerating Large Language Model Training via Flexible Sequence Parallelism
by: Wang, Yujie, et al.
Published: (2024)

A Parallel Alternative for Energy-Efficient Neural Network Training and Inferencing
by: Seal, Sudip K., et al.
Published: (2025)

Heterogeneous Parallelism for Multimodal Large Language Model Training
by: Karnati, Yashaswi, et al.
Published: (2026)

On Optimizing the Communication of Model Parallelism
by: Zhuang, Yonghao, et al.
Published: (2022)

SuperFedNAS: Cost-Efficient Federated Neural Architecture Search for On-Device Inference
by: Khare, Alind, et al.
Published: (2023)

HelixPipe: Efficient Distributed Training of Long Sequence Transformers with Attention Parallel Pipeline Parallelism
by: Zhang, Geng, et al.
Published: (2025)

PipeLive: Efficient Live In-place Pipeline Parallelism Reconfiguration for Dynamic LLM Serving
by: Bai, Xu, et al.
Published: (2026)

Hydraulis: Balancing Large Transformer Model Training via Co-designing Parallel Strategies and Data Assignment
by: Li, Haoyang, et al.
Published: (2024)

TAPAS: Fast and Automatic Derivation of Tensor Parallel Strategies for Large Neural Networks
by: Shi, Ziji, et al.
Published: (2023)

A Readiness-Driven Runtime for Pipeline-Parallel Training under Runtime Variability
by: Liu, Ruitao, et al.
Published: (2026)

Asynchronous Evolution of Deep Neural Network Architectures
by: Liang, Jason, et al.
Published: (2023)

Edge-Parallel Graph Encoder Embedding
by: Lubonja, Ariel, et al.
Published: (2024)

Federated Learning on Stochastic Neural Networks
by: Tang, Jingqiao, et al.
Published: (2025)

Graph Neural Networks Gone Hogwild
by: Solodova, Olga, et al.
Published: (2024)

Cooperative Minibatching in Graph Neural Networks
by: Balin, Muhammed Fatih, et al.
Published: (2023)

Fully Distributed Online Training of Graph Neural Networks in Networked Systems
by: Olshevskyi, Rostyslav, et al.
Published: (2024)

Nesterov Method for Asynchronous Pipeline Parallel Optimization
by: Ajanthan, Thalaiyasingam, et al.
Published: (2025)

MP-SL: Multihop Parallel Split Learning
by: Tirana, Joana, et al.
Published: (2024)

MoE Parallel Folding: Heterogeneous Parallelism Mappings for Efficient Large-Scale MoE Model Training with Megatron Core
by: Liu, Dennis, et al.
Published: (2025)

Two-dimensional Sparse Parallelism for Large Scale Deep Learning Recommendation Model Training
by: Zhang, Xin, et al.
Published: (2025)

ShardTensor: Domain Parallelism for Scientific Machine Learning
by: Adams, Corey, et al.
Published: (2026)

PiPar: Pipeline Parallelism for Collaborative Machine Learning
by: Zhang, Zihan, et al.
Published: (2022)

DSP: Dynamic Sequence Parallelism for Multi-Dimensional Transformers
by: Zhao, Xuanlei, et al.
Published: (2024)

PaSE: Parallelization Strategies for Efficient DNN Training
by: Elango, Venmugil
Published: (2024)

Heterogeneous Federated Learning with Convolutional and Spiking Neural Networks
by: Yu, Yingchao, et al.
Published: (2024)

Robust Fully-Asynchronous Methods for Distributed Training over General Architecture
by: Zhu, Zehan, et al.
Published: (2023)

Improving Automatic Parallel Training via Balanced Memory Workload Optimization
by: Wang, Yujie, et al.
Published: (2023)

Parallelization of the K-Means Algorithm with Applications to Big Data Clustering
by: Srivastava, Ashish, et al.
Published: (2024)

Efficient Parallelization Layouts for Large-Scale Distributed Model Training
by: Hagemann, Johannes, et al.
Published: (2023)

Harnessing Increased Client Participation with Cohort-Parallel Federated Learning
by: Dhasade, Akash, et al.
Published: (2024)