| Main Authors: | Cai, Ye; Yang, Zonglin; Ni, Liwei; Xie, Biwei; Li, Xingquan |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2404.13614 |
Similar Items
Parallel AIG Refactoring via Conflict Breaking
by: Cai, Ye, et al.
Published: (2024)
Lectures on Parallel Computing
by: Träff, Jesper Larsson
Published: (2024)
Mapping Gemma3 onto an Edge Dataflow Architecture
by: Du, Shouyu, et al.
Published: (2026)
Parallel Collaborative ADMM Privacy Computing and Adaptive GPU Acceleration for Distributed Edge Networks
by: Xia, Mengchun, et al.
Published: (2026)
Communication-Computation Pipeline Parallel Split Learning over Wireless Edge Networks
by: Liu, Chenyu, et al.
Published: (2025)
Optimizing Long-context LLM Serving via Fine-grained Sequence Parallelism
by: Li, Cong, et al.
Published: (2025)
Neutron particle transport 3D method of characteristic Multi GPU platform Parallel Computing
by: Zhou, Faguo, et al.
Published: (2025)
SparseMap: Loop Mapping for Sparse CNNs on Streaming Coarse-grained Reconfigurable Array
by: Ni, Xiaobing, et al.
Published: (2024)
Enhancing Memory Efficiency in Large Language Model Training Through Chronos-aware Pipeline Parallelism
by: Lin, Xinyuan, et al.
Published: (2025)
Resource-efficient Parallel Split Learning in Heterogeneous Edge Computing
by: Zhang, Mingjin, et al.
Published: (2024)
What Every Computer Scientist Needs To Know About Parallelization
by: Adefemi, Temitayo
Published: (2025)
Deferred Objects to Enhance Smart Contract Programming with Optimistic Parallel Execution
by: Mitenkov, George, et al.
Published: (2024)
EinDecomp: Decomposition of Declaratively-Specified Machine Learning and Numerical Computations for Parallel Execution
by: Bourgeois, Daniel, et al.
Published: (2024)
Malleus: Straggler-Resilient Hybrid Parallel Training of Large-scale Models via Malleable Data and Model Parallelization
by: Li, Haoyang, et al.
Published: (2024)
GPU-Based Parallel Computing Methods for Medical Photoacoustic Image Reconstruction
by: Yi, Xinyao, et al.
Published: (2024)
Minimizing Communication for Parallel Symmetric Tensor Times Same Vector Computation
by: Daas, Hussam Al, et al.
Published: (2025)
Mapping Parallel Matrix Multiplication in GotoBLAS2 to the AMD Versal ACAP for Deep Learning
by: Lei, Jie, et al.
Published: (2024)
DCP: Addressing Input Dynamism In Long-Context Training via Dynamic Context Parallelism
by: Jiang, Chenyu, et al.
Published: (2025)
A Unified Approach to Concurrent, Parallel Map-Reduce in R using Futures
by: Bengtsson, Henrik
Published: (2026)
Balancing Pipeline Parallelism with Vocabulary Parallelism
by: Yeung, Man Tsung, et al.
Published: (2024)
Unleashing Scalable Context Parallelism for Foundation Models Pre-Training via FCP
by: Zhao, Yilong, et al.
Published: (2026)
Optimizing View Change for Byzantine Fault Tolerance in Parallel Consensus
by: Xie, Yifei, et al.
Published: (2026)
Parallel Reduced Order Modeling for Digital Twins using High-Performance Computing Workflows
by: de Parga, S. Ares, et al.
Published: (2024)
Mining Area Skyline Objects from Map-based Big Data using Apache Spark Framework
by: Li, Chen, et al.
Published: (2024)
Optimizing Task Scheduling in Heterogeneous Computing Environments: A Comparative Analysis of CPU, GPU, and ASIC Platforms Using E2C Simulator
by: Mohammadjafari, Ali, et al.
Published: (2024)
ResiHP: Taming LLM Training Failures with Dynamic Hybrid Parallelism
by: Ma, Tenghui, et al.
Published: (2026)
Accelerating Heterogeneous Tensor Parallelism via Flexible Workload Control
by: Wang, Zhigang, et al.
Published: (2024)
FedRFQ: Prototype-Based Federated Learning with Reduced Redundancy, Minimal Failure, and Enhanced Quality
by: Yan, Biwei, et al.
Published: (2024)
Synergistic Tensor and Pipeline Parallelism
by: Qi, Mengshi, et al.
Published: (2025)
ZeroPP: Unleashing Exceptional Parallelism Efficiency through Tensor-Parallelism-Free Methodology
by: Tang, Ding, et al.
Published: (2024)
Ghidorah: Fast LLM Inference on Edge with Speculative Decoding and Hetero-Core Parallelism
by: Wei, Jinhui, et al.
Published: (2025)
Committee Configuration Optimization for Parallel Byzantine Consensus in a Trusted Execution Environment
by: Xie, Yifei, et al.
Published: (2026)
SPPO: Efficient Long-sequence LLM Training via Adaptive Sequence Pipeline Parallel Offloading
by: Chen, Qiaoling, et al.
Published: (2025)
NanoCP: Request-Level Dynamic Context Parallelism for Data-Expert Parallel Decoding
by: Chen, Jiefei, et al.
Published: (2026)
Communication-Efficient Model Aggregation with Layer Divergence Feedback in Federated Learning
by: Wang, Liwei, et al.
Published: (2024)
A Flexible Programmable Pipeline Parallelism Framework for Efficient DNN Training
by: Jiang, Lijuan, et al.
Published: (2025)
Linear Complexity $\mathcal{H}^2$ Direct Solver for Fine-Grained Parallel Architectures
by: Boukaram, Wajih, et al.
Published: (2025)
Driving Computational Efficiency in Large-Scale Platforms using HPC Technologies
by: Mendez, Alexander Martinez, et al.
Published: (2026)
A Unified Programming Model for Heterogeneous Computing with CPU and Accelerator Technologies
by: Xiong, Yuqing
Published: (2022)
Oases: Efficient Large-Scale Model Training on Commodity Servers via Overlapped and Automated Tensor Model Parallelism
by: Li, Shengwei, et al.
Published: (2023)