:: Library Catalog

Bibliographic Details
Main Authors:	Yang, Mengtian; Zhang, Zhekun; Wu, Mingheng; Yan, Jianwen; Sun, Hanshi; Chang, Li-wen
Format:	Preprint
Published:	2026
Subjects:	Distributed, Parallel, and Cluster Computing; Artificial Intelligence; Machine Learning; Programming Languages
Online Access:	https://arxiv.org/abs/2605.17164

Similar Items

UniEP: Unified Expert-Parallel MoE MegaKernel for LLM Training
by: Zheng, Size, et al.
Published: (2026)

Evaluating SYCL as a Unified Programming Model for Heterogeneous Systems
by: Marowka, Ami
Published: (2026)

Scaling Deep Learning Training with MPMD Pipeline Parallelism
by: Xhebraj, Anxhelo, et al.
Published: (2024)

Morphling: Fast, Fused, and Flexible GNN Training at Scale
by: Anubhab, et al.
Published: (2025)

LeMix: Unified Scheduling for LLM Training and Inference on Multi-GPU Systems
by: Li, Yufei, et al.
Published: (2025)

Simplicity Scales
by: Sampson, Andrew, et al.
Published: (2026)

ScanWeaver: Compiler-Driven Parallelization of Affine Recurrences via Associative Scan Lowering
by: Wu, Qiying, et al.
Published: (2026)

veScale: Consistent and Efficient Tensor Programming with Eager-Mode SPMD
by: Li, Youjie, et al.
Published: (2025)

MemFine: Memory-Aware Fine-Grained Scheduling for MoE Training
by: Zhao, Lu, et al.
Published: (2025)

Event Tensor: A Unified Abstraction for Compiling Dynamic Megakernel
by: Jin, Hongyi, et al.
Published: (2026)

Optimus: Accelerating Large-Scale Multi-Modal LLM Training by Bubble Exploitation
by: Feng, Weiqi, et al.
Published: (2024)

Multi-Relational Algebra for Multi-Granular Data Analytics
by: Wu, Xi, et al.
Published: (2023)

Fine-Grained Energy Prediction For Parallellized LLM Inference With PIE-P
by: Dutt, Anurag, et al.
Published: (2025)

Publish on Ping: A Better Way to Publish Reservations in Memory Reclamation for Concurrent Data Structures
by: Singh, Ajay, et al.
Published: (2025)

Timetide: A programming model for logically synchronous distributed systems
by: Kenwright, Logan, et al.
Published: (2025)

Hydra: Virtualized Multi-Language Runtime for High-Density Serverless Platforms
by: Ivanenko, Serhii, et al.
Published: (2022)

MCFuser: High-Performance and Rapid Fusion of Memory-Bound Compute-Intensive Operators
by: Zhang, Zheng, et al.
Published: (2025)

Verifying Properties of Index Arrays in a Purely-Functional Data-Parallel Language
by: Hinnerskov, Nikolaj Hey, et al.
Published: (2025)

An MLIR pipeline for offloading Fortran to FPGAs via OpenMP
by: Rodriguez-Canal, Gabriel, et al.
Published: (2025)

Sal: Multi-modal Verification of Replicated Data Types
by: Ramesh, Pranav, et al.
Published: (2026)

Assessing Opportunities of SYCL for Biological Sequence Alignment on GPU-based Systems
by: Costanzo, Manuel, et al.
Published: (2022)

Flo: a Semantic Foundation for Progressive Stream Processing
by: Laddad, Shadaj, et al.
Published: (2024)

Choreographies as Macros
by: Bohosian, Alexander, et al.
Published: (2025)

OMP4Py: a pure Python implementation of OpenMP
by: Piñeiro, César, et al.
Published: (2024)

Suki: Choreographed Distributed Dataflow in Rust
by: Laddad, Shadaj, et al.
Published: (2024)

Towards a Function-as-a-Service Choreographic Programming Language: Examples and Applications
by: De Palma, Giuseppe, et al.
Published: (2024)

On the Duality of Task and Actor Programming Models
by: Yadav, Rohan, et al.
Published: (2025)

GuStL - An Experimental Guarded States Language
by: Schirmer, Oskar
Published: (2016)

Detrimental task execution patterns in mainstream OpenMP runtimes
by: Tuft, Adam S., et al.
Published: (2024)

We Know I Know You Know; Choreographic Programming With Multicast and Multiply Located Values
by: Bates, Mako, et al.
Published: (2024)

Extending Contract Verification for Parallel Programming Models to Fortran
by: Oraji, Yussur Mustafa, et al.
Published: (2026)

Mat2Boundary: Treating User-Defined Boundary Condition as SpMV for Distributed PDE Solvers on Block-Structured Grids
by: Cai, Yanzheng, et al.
Published: (2026)

Streamlining Cloud-Native Application Development and Deployment with Robust Encapsulation
by: Lertpongrujikorn, Pawissanutt, et al.
Published: (2024)

PRDTs: Composable Knowledge-Based Consensus Protocols with Replicated Data Types
by: Haas, Julian, et al.
Published: (2025)

Distributed Locking as a Data Type
by: Haas, Julian, et al.
Published: (2024)

Fully integrating the Flang Fortran compiler with standard MLIR
by: Brown, Nick
Published: (2024)

Introducing Support for Move Operations in Melda CRDT
by: Brocco, Amos
Published: (2025)

Actor Capabilities for Message Ordering (Extended Version)
by: Gordon, Colin S.
Published: (2025)

Failure Transparency in Stateful Dataflow Systems (Technical Report)
by: Veresov, Aleksey, et al.
Published: (2024)

Mapple: A Domain-Specific Language for Mapping Distributed Programs
by: Wei, Anjiang, et al.
Published: (2025)