Library Catalog

Bibliographic Details
Main Authors:	Cai, Ye; Yang, Zonglin; Ni, Liwei; Liu, Junfeng; Xie, Biwei; Li, Xingquan
Format:	Preprint
Published:	2024
Subjects:	Distributed, Parallel, and Cluster Computing
Online Access:	https://arxiv.org/abs/2404.13617

Similar Items

Enhancing ASIC Technology Mapping via Parallel Supergate Computing
by: Cai, Ye, et al.
Published: (2024)

Efficient Parallel Execution of Blockchain Transactions Leveraging Conflict Specifications
by: Anjana, Parwat Singh, et al.
Published: (2025)

HP-MDR: High-performance and Portable Data Refactoring and Progressive Retrieval with Advanced GPUs
by: Li, Yanliang, et al.
Published: (2025)

FlexPipe: Adapting Dynamic LLM Serving Through Inflight Pipeline Refactoring in Fragmented Serverless Clusters
by: Lin, Yanying, et al.
Published: (2025)

Optimizing Long-context LLM Serving via Fine-grained Sequence Parallelism
by: Li, Cong, et al.
Published: (2025)

Cold-Start Anti-Patterns and Refactorings in Serverless Systems: An Empirical Study
by: Tariq, Syed Salauddin Mohammad, et al.
Published: (2025)

Malleus: Straggler-Resilient Hybrid Parallel Training of Large-scale Models via Malleable Data and Model Parallelization
by: Li, Haoyang, et al.
Published: (2024)

Accelerating Microswimmer Simulations via a Heterogeneous Pipelined Parallel-in-Time Framework
by: Huang, Ruixiang, et al.
Published: (2026)

Linear Complexity $\mathcal{H}^2$ Direct Solver for Fine-Grained Parallel Architectures
by: Boukaram, Wajih, et al.
Published: (2025)

DCP: Addressing Input Dynamism In Long-Context Training via Dynamic Context Parallelism
by: Jiang, Chenyu, et al.
Published: (2025)

Oases: Efficient Large-Scale Model Training on Commodity Servers via Overlapped and Automated Tensor Model Parallelism
by: Li, Shengwei, et al.
Published: (2023)

NanoCP: Request-Level Dynamic Context Parallelism for Data-Expert Parallel Decoding
by: Chen, Jiefei, et al.
Published: (2026)

Advances in Semantic Patching for HPC-oriented Refactorings with Coccinelle
by: Martone, Michele, et al.
Published: (2025)

Balancing Pipeline Parallelism with Vocabulary Parallelism
by: Yeung, Man Tsung, et al.
Published: (2024)

Unleashing Scalable Context Parallelism for Foundation Models Pre-Training via FCP
by: Zhao, Yilong, et al.
Published: (2026)

Optimizing View Change for Byzantine Fault Tolerance in Parallel Consensus
by: Xie, Yifei, et al.
Published: (2026)

S-HPLB: Efficient LLM Attention Serving via Sparsity-Aware Head Parallelism Load Balance
by: Liu, Di, et al.
Published: (2026)

Communication-Computation Pipeline Parallel Split Learning over Wireless Edge Networks
by: Liu, Chenyu, et al.
Published: (2025)

HYDRA: Breaking the Global Ordering Barrier in Multi-BFT Consensus
by: Lyu, Hanzheng, et al.
Published: (2025)

Will LLMs Scaling Hit the Wall? Breaking Barriers via Distributed Resources on Massive Edge Devices
by: Shen, Tao, et al.
Published: (2025)

Maximizing Blockchain Performance: Mitigating Conflicting Transactions through Parallelism and Dependency Management
by: Bappy, Faisal Haque, et al.
Published: (2024)

ResiHP: Taming LLM Training Failures with Dynamic Hybrid Parallelism
by: Ma, Tenghui, et al.
Published: (2026)

Accelerating Heterogeneous Tensor Parallelism via Flexible Workload Control
by: Wang, Zhigang, et al.
Published: (2024)

Synergistic Tensor and Pipeline Parallelism
by: Qi, Mengshi, et al.
Published: (2025)

ZeroPP: Unleashing Exceptional Parallelism Efficiency through Tensor-Parallelism-Free Methodology
by: Tang, Ding, et al.
Published: (2024)

Ghidorah: Fast LLM Inference on Edge with Speculative Decoding and Hetero-Core Parallelism
by: Wei, Jinhui, et al.
Published: (2025)

Committee Configuration Optimization for Parallel Byzantine Consensus in a Trusted Execution Environment
by: Xie, Yifei, et al.
Published: (2026)

Enhancing Memory Efficiency in Large Language Model Training Through Chronos-aware Pipeline Parallelism
by: Lin, Xinyuan, et al.
Published: (2025)

HAP: Hybrid Adaptive Parallelism for Efficient Mixture-of-Experts Inference
by: Lin, Haoran, et al.
Published: (2025)

Parallel Collaborative ADMM Privacy Computing and Adaptive GPU Acceleration for Distributed Edge Networks
by: Xia, Mengchun, et al.
Published: (2026)

SPPO: Efficient Long-sequence LLM Training via Adaptive Sequence Pipeline Parallel Offloading
by: Chen, Qiaoling, et al.
Published: (2025)

MoEntwine: Unleashing the Potential of Wafer-scale Chips for Large-scale Expert Parallel Inference
by: Tang, Xinru, et al.
Published: (2025)

Pending Conflicts Make Progress Impossible
by: Kuznetsov, Petr, et al.
Published: (2026)

Communication-Efficient Model Aggregation with Layer Divergence Feedback in Federated Learning
by: Wang, Liwei, et al.
Published: (2024)

A Flexible Programmable Pipeline Parallelism Framework for Efficient DNN Training
by: Jiang, Lijuan, et al.
Published: (2025)

Surviving Partial Rank Failures in Wide Expert-Parallel MoE Inference
by: Sun, Xun, et al.
Published: (2026)

Hecate: Unlocking Efficient Sparse Model Training via Fully Sharded Sparse Data Parallelism
by: Qing, Yuhao, et al.
Published: (2025)

High-Performance N-Queens Solver on GPU: Iterative DFS with Zero Bank Conflicts
by: Yao, Guangchao, et al.
Published: (2025)

cuFastTuckerPlus: A Stochastic Parallel Sparse FastTucker Decomposition Using GPU Tensor Cores
by: Li, Zixuan, et al.
Published: (2024)

Hyperion: Hierarchical Scheduling for Parallel LLM Acceleration in Multi-tier Networks
by: Ma, Mulei, et al.
Published: (2025)