| Main Authors: | He, Minghua, Zhang, Lingzhe, Liu, Yuan, Zhou, Xiao, Liu, Aiwei |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2605.30851 |
Similar Items
A Survey on Parallel Text Generation: From Parallel Decoding to Diffusion Language Models
by: Zhang, Lingzhe, et al.
Published: (2025)
Accelerating Diffusion LLMs via Adaptive Parallel Decoding
by: Israel, Daniel, et al.
Published: (2025)
HD-MoE: Hybrid and Dynamic Parallelism for Mixture-of-Expert LLMs with 3D Near-Memory Processing
by: Huang, Haochen, et al.
Published: (2025)
Spatiotemporal Analysis of Parallelized Computing at the Extreme Edge
by: Nabil, Yasser, et al.
Published: (2025)
EDAN: Towards Understanding Memory Parallelism and Latency Sensitivity in HPC
by: Shen, Siyuan, et al.
Published: (2025)
On Orchestrating Parallel Broadcasts for Distributed Ledgers
by: Sheng, Peiyao, et al.
Published: (2024)
GigaAPI for GPU Parallelization
by: Suvarna, M., et al.
Published: (2025)
Robust Recursive Query Parallelism in Graph Database Management Systems
by: Chakraborty, Anurag, et al.
Published: (2025)
Fault-Tolerant Hybrid-Parallel Training at Scale with Reliable and Efficient In-memory Checkpointing
by: Wang, Yuxin, et al.
Published: (2023)
HeteGen: Heterogeneous Parallel Inference for Large Language Models on Resource-Constrained Devices
by: Zhao, Xuanlei, et al.
Published: (2024)
PEVLM: Parallel Encoding for Vision-Language Models
by: Kang, Letian, et al.
Published: (2025)
Automated Programmatic Performance Analysis of Parallel Programs
by: Cankur, Onur, et al.
Published: (2024)
Parallelizing a modern GPU simulator
by: Huerta, Rodrigo, et al.
Published: (2025)
Optimal Parallel Scheduling under Concave Speedup Functions
by: Li, Chengzhang, et al.
Published: (2025)
Recorder: Comprehensive Parallel I/O Tracing and Analysis
by: Wang, Chen, et al.
Published: (2025)
GPU-Accelerated Parallel Selected Inversion for Structured Matrices Using sTiles
by: Fattah, Esmail Abdul, et al.
Published: (2025)
Automated Calibration of Parallel and Distributed Computing Simulators: A Case Study
by: McDonald, Jesse, et al.
Published: (2024)
ParaLog: Consistent Host-side Logging for Parallel Checkpoints
by: Chien, Steven W. D., et al.
Published: (2024)
Cache Blocking of Distributed-Memory Parallel Matrix Power Kernels
by: Lacey, Dane C., et al.
Published: (2024)
Large-Scale Data Parallelization of Product Quantization and Inverted Indexing Using Dask
by: Abraham, Ashley N., et al.
Published: (2026)
Parallel Implementations Assessment of a Spatial-Spectral Classifier for Hyperspectral Clinical Applications
by: Lazcano, Raquel, et al.
Published: (2024)
Comparing Parallel Functional Array Languages: Programming and Performance
by: van Balen, David, et al.
Published: (2025)
ACALSim: A Scalable Parallel Simulation Framework for High-Performance System Design Space Exploration
by: Lin, Wei-Fen, et al.
Published: (2026)
Kino-PAX: Highly Parallel Kinodynamic Sampling-based Planner
by: Perrault, Nicolas, et al.
Published: (2024)
Parallel $k$d-tree with Batch Updates
by: Men, Ziyang, et al.
Published: (2024)
Fine-Grained Energy Prediction For Parallellized LLM Inference With PIE-P
by: Dutt, Anurag, et al.
Published: (2025)
Selective Parallel Loading of Large-Scale Compressed Graphs with ParaGrapher
by: Esfahani, Mohsen Koohi, et al.
Published: (2024)
Matryoshka: Optimization of Dynamic Diverse Quantum Chemistry Systems via Elastic Parallelism Transformation
by: Wang, Tuowei, et al.
Published: (2024)
Can Large Language Models Predict Parallel Code Performance?
by: Bolet, Gregory, et al.
Published: (2025)
An Efficient Hybrid Sparse Attention with CPU-GPU Parallelism for Long-Context Inference
by: Yao, Feiyu, et al.
Published: (2026)
ParEVO: Synthesizing Code for Irregular Data: High-Performance Parallelism through Agentic Evolution
by: Yang, Liu, et al.
Published: (2026)
Using UML State Diagrams for Modelling the Performance of Parallel Programs
by: Jorge Ortega Arjona
Published: (2008)
CPMA: An Efficient Batch-Parallel Compressed Set Without Pointers
by: Wheatman, Brian, et al.
Published: (2023)
PipeFusion: Patch-level Pipeline Parallelism for Diffusion Transformers Inference
by: Fang, Jiarui, et al.
Published: (2024)
Binary Bleed: Fast Distributed and Parallel Method for Automatic Model Selection
by: Barron, Ryan, et al.
Published: (2024)
DIAL: Decentralized I/O AutoTuning via Learned Client-side Local Metrics for Parallel File System
by: Rashid, Md Hasanur, et al.
Published: (2026)
CARAT: Client-Side Adaptive RPC and Cache Co-Tuning for Parallel File Systems
by: Rashid, Md Hasanur, et al.
Published: (2026)
Efficient Chromosome Parallelization for Precision Medicine Genomic Workflows
by: Montserrat, Daniel Mas, et al.
Published: (2025)
Massimult: A Novel Parallel CPU Architecture Based on Combinator Reduction
by: Nicklisch-Franken, Jurgen, et al.
Published: (2024)
Parallel I/O Characterization and Optimization on Large-Scale HPC Systems: A 360-Degree Survey
by: Ather, Hammad, et al.
Published: (2024)