| Main Authors: | Patwardhan, Ishan, Gandhi, Shubham, Khare, Om, Joshi, Amit, Sawant, Suraj |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2405.15628 |
Similar Items
Hybrid Quantum-HPC Solutions for Max-Cut: Bridging Classical and Quantum Algorithms
by: Patwardhan, Ishan, et al.
Published: (2024)
PolyKV: A Shared Asymmetrically-Compressed KV Cache Pool for Multi-Agent LLM Inference
by: Patel, Ishan, et al.
Published: (2026)
Sparse Checkpointing for Fast and Reliable MoE Training
by: Gandhi, Swapnil, et al.
Published: (2024)
Distributed Locking: Performance Analysis and Optimization Strategies
by: Rodriguez, Andre, et al.
Published: (2025)
KnapsackLB: Enabling Performance-Aware Layer-4 Load Balancing
by: Gandhi, Rohan, et al.
Published: (2024)
Comparative Analysis of Distributed Caching Algorithms: Performance Metrics and Implementation Considerations
by: Mayer, Helen, et al.
Published: (2025)
Comparative Analysis of Lightweight Kubernetes Distributions for Edge Computing: Performance and Resource Efficiency
by: Yakubov, Diyaz, et al.
Published: (2025)
Cost-Performance Analysis: A Comparative Study of CPU-Based Serverless and GPU-Based Training Architectures
by: Barrak, Amine, et al.
Published: (2025)
ParEval-Repo: A Benchmark Suite for Evaluating LLMs with Repository-level HPC Translation Tasks
by: Davis, Joshua H., et al.
Published: (2025)
Efficient Distributed MLLM Training with Cornstarch
by: Jang, Insu, et al.
Published: (2025)
A flexible FPGA accelerator for convolutional neural networks
by: Majumder, Kingshuk, et al.
Published: (2019)
Characterizing FaaS Workflows on Public Clouds: The Good, the Bad and the Ugly
by: Kulkarni, Varad, et al.
Published: (2025)
Fault-Tolerant Decentralized Distributed Asynchronous Federated Learning with Adaptive Termination Detection
by: Akkinepally, Phani Sahasra, et al.
Published: (2025)
Leveraging Hardware Performance Counters for Predicting Workload Interference in Vector Supercomputers
by: Shubham, et al.
Published: (2024)
ReCycle: Resilient Training of Large DNNs using Pipeline Adaptation
by: Gandhi, Swapnil, et al.
Published: (2024)
FailSafe: High-performance Resilient Serving
by: Xu, Ziyi, et al.
Published: (2025)
Tetris: Efficient Intra-Datacenter Calls Packing for Large Conferencing Services
by: Gandhi, Rohan, et al.
Published: (2025)
Efficient Training of Large Language Models on Distributed Infrastructures: A Survey
by: Duan, Jiangfei, et al.
Published: (2024)
A Study on the Performance of Distributed Training of Data-driven CFD Simulations
by: Iserte, Sergio, et al.
Published: (2026)
Heta: Distributed Training of Heterogeneous Graph Neural Networks
by: Zhong, Yuchen, et al.
Published: (2024)
Galvatron: Automatic Distributed Training for Large Transformer Models
by: Gumaan, Esmail
Published: (2025)
Addressing Variable Heterogeneity in Distributed Multimodal Training with Entrain
by: Jang, Insu, et al.
Published: (2026)
Accelerating Distributed MoE Training and Inference with Lina
by: Li, Jiamin, et al.
Published: (2022)
Optimizing Distributed Training Approaches for Scaling Neural Networks
by: Baligodugula, Vishnu Vardhan, et al.
Published: (2025)
A Comparative Analysis of Identifier Schemes: UUIDv4, UUIDv7, and ULID for Distributed Systems
by: Kakolaki, Nima Karimian
Published: (2025)
On the Performance and Memory Footprint of Distributed Training: An Empirical Study on Transformers
by: Lu, Zhengxian, et al.
Published: (2024)
MegatronApp: Efficient and Comprehensive Management on Distributed LLM Training
by: Zhao, Bohan, et al.
Published: (2025)
Sutradhara: An Intelligent Orchestrator-Engine Co-design for Tool-based Agentic Inference
by: Biswas, Anish, et al.
Published: (2026)
DeepCompile: A Compiler-Driven Approach to Optimizing Distributed Deep Learning Training
by: Tanaka, Masahiro, et al.
Published: (2025)
A Survey of End-to-End Modeling for Distributed DNN Training: Workloads, Simulators, and TCO
by: Svedas, Jonas, et al.
Published: (2025)
An Experimental Comparison of Partitioning Strategies for Distributed Graph Neural Network Training
by: Merkel, Nikolai, et al.
Published: (2023)
Fulcrum: Optimizing Concurrent DNN Training and Inferencing on Edge Accelerators
by: K., Prashanthi S., et al.
Published: (2025)
A Comparative Evaluation of Automated Analysis Tools for Solidity Smart Contracts
by: Wei, Zhiyuan, et al.
Published: (2023)
Nezha: Breaking Multi-Rail Network Barriers for Distributed DNN Training
by: Yu, Enda, et al.
Published: (2024)
Poplar: Efficient Scaling of Distributed DNN Training on Heterogeneous GPU Clusters
by: Zhang, WenZheng, et al.
Published: (2024)
Lagom: Unleashing the Power of Communication and Computation Overlapping for Distributed LLM Training
by: Xu, Guanbin, et al.
Published: (2026)
DistFlow: A Fully Distributed RL Framework for Scalable and Efficient LLM Post-Training
by: Wang, Zhixin, et al.
Published: (2025)
PruneX: A Hierarchical Communication-Efficient System for Distributed CNN Training with Structured Pruning
by: Olama, Alireza, et al.
Published: (2025)
FlowMoE: A Scalable Pipeline Scheduling Framework for Distributed Mixture-of-Experts Training
by: Gao, Yunqi, et al.
Published: (2025)
Analysis of Distributed Algorithms for Big-data
by: Purohit, Rajendra, et al.
Published: (2024)