| Main Author: | Wangni, Jianqiao |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2504.07513 |
Similar Items
Galvatron: An Automatic Distributed System for Efficient Foundation Model Training
by: Liu, Xinyi, et al.
Published: (2025)
Laminar: A Scalable Asynchronous RL Post-Training Framework
by: Sheng, Guangming, et al.
Published: (2025)
Scalable and Adaptive Parallel Training of Graph Transformer on Large Graphs
by: Lin, Jun-Liang, et al.
Published: (2026)
Democratizing AI: Open-source Scalable LLM Training on GPU-based Supercomputers
by: Singh, Siddharth, et al.
Published: (2025)
Guard: Scalable Straggler Detection and Node Health Management for Large-Scale Training
by: Liu, Guanliang, et al.
Published: (2026)
GraphPipe: Improving Performance and Scalability of DNN Training with Graph Pipeline Parallelism
by: Jeon, Byungsoo, et al.
Published: (2024)
Communication-free Sampling and 4D Hybrid Parallelism for Scalable Mini-batch GNN Training
by: Wei, Cunyang, et al.
Published: (2026)
Enhancing Data Quality in Federated Fine-Tuning of Foundation Models
by: Zhao, Wanru, et al.
Published: (2024)
A Survey of Resource-efficient LLM and Multimodal Foundation Models
by: Xu, Mengwei, et al.
Published: (2024)
Fine-Tuning GPT-5 for GPU Kernel Generation
by: Tehrani, Ali, et al.
Published: (2026)
Scalable Pretraining of Large Mixture of Experts Language Models on Aurora Super Computer
by: Vooturi, Dharma Teja, et al.
Published: (2026)
GPU Memory Prediction for Multimodal Model Training
by: Jeong, Jinwoo, et al.
Published: (2025)
DP2FL: Dual Prompt Personalized Federated Learning in Foundation Models
by: Chang, Ying, et al.
Published: (2025)
When Foundation Model Meets Federated Learning: Motivations, Challenges, and Future Directions
by: Zhuang, Weiming, et al.
Published: (2023)
MQ-GNN: A Multi-Queue Pipelined Architecture for Scalable and Efficient GNN Training
by: Ullah, Irfan, et al.
Published: (2026)
Efficient and Scalable Agentic AI with Heterogeneous Systems
by: Asgar, Zain, et al.
Published: (2025)
Context Parallelism for Scalable Million-Token Inference
by: Yang, Amy, et al.
Published: (2024)
Scalable Artificial Intelligence for Science: Perspectives, Methods and Exemplars
by: Brewer, Wesley, et al.
Published: (2024)
TrainVerify: Equivalence-Based Verification for Distributed LLM Training
by: Lu, Yunchi, et al.
Published: (2025)
PipeOffload: Improving Scalability of Pipeline Parallelism with Memory Optimization
by: Wan, Xinyi, et al.
Published: (2025)
Hubs and Spokes Learning: Efficient and Scalable Collaborative Machine Learning
by: Sharma, Atul, et al.
Published: (2025)
The Big Send-off: Scalable and Performant Collectives for Deep Learning
by: Singh, Siddharth, et al.
Published: (2025)
DistShap: Scalable GNN Explanations with Distributed Shapley Values
by: Akkas, Selahattin, et al.
Published: (2025)
DPQuant: Efficient and Differentially-Private Model Training via Dynamic Quantization Scheduling
by: Gao, Yubo, et al.
Published: (2025)
BitPipe: Bidirectional Interleaved Pipeline Parallelism for Accelerating Large Models Training
by: Wu, Houming, et al.
Published: (2024)
FedComLoc: Communication-Efficient Distributed Training of Sparse and Quantized Models
by: Yi, Kai, et al.
Published: (2024)
Training Ultra Long Context Language Model with Fully Pipelined Distributed Transformer
by: Yao, Jinghan, et al.
Published: (2024)
EASTER: Embedding Aggregation-based Heterogeneous Models Training in Vertical Federated Learning
by: Wang, Shuo, et al.
Published: (2023)
Training Heterogeneous Client Models using Knowledge Distillation in Serverless Federated Learning
by: Chadha, Mohak, et al.
Published: (2024)
Intelligent Sampling of Extreme-Scale Turbulence Datasets for Accurate and Efficient Spatiotemporal Model Training
by: Brewer, Wesley, et al.
Published: (2025)
FSD-Inference: Fully Serverless Distributed Inference with Scalable Cloud Communication
by: Oakley, Joe, et al.
Published: (2024)
AccidentGPT: Large Multi-Modal Foundation Model for Traffic Accident Analysis
by: Wu, Kebin, et al.
Published: (2024)
TawPipe: Topology-Aware Weight Pipeline Parallelism for Accelerating Long-Context Large Models Training
by: Wu, Houming, et al.
Published: (2025)
Federated Learning with Workload Reduction through Partial Training of Client Models and Entropy-Based Data Selection
by: Shi, Hongrui, et al.
Published: (2024)
Piper: Efficient Large-Scale MoE Training via Resource Modeling and Pipelined Hybrid Parallelism
by: Dash, Sajal, et al.
Published: (2026)
Post-Deterministic Distributed Systems: A New Foundation for Trustworthy Autonomous Infrastructure
by: He, Jun, et al.
Published: (2026)
EE-LLM: Large-Scale Training and Inference of Early-Exit Large Language Models with 3D Parallelism
by: Chen, Yanxi, et al.
Published: (2023)
FedPBS: Proximal-Balanced Scaling Federated Learning Model for Robust Personalized Training for Non-IID Data
by: AbouNassar, Eman M., et al.
Published: (2026)
SFPrompt: Communication-Efficient Split Federated Fine-Tuning for Large Pre-Trained Models over Resource-Limited Devices
by: Cao, Linxiao, et al.
Published: (2024)
Robust LLM Training Infrastructure at ByteDance
by: Wan, Borui, et al.
Published: (2025)