:: Library Catalog

Bibliographic Details
Main Authors:	Zhu, Jun; Xu, Yin; He, Dazhi; Li, Haoyang; Guan, Yunfeng; Zhang, Wenjun; Ma, Tianyao; Yuan, Haozhi
Format:	Preprint
Published:	2025
Subjects:	Distributed, Parallel, and Cluster Computing
Online Access:	https://arxiv.org/abs/2503.10525

Similar Items

Efficient Parallel Implementation of the Pilot Assignment Problem in Massive MIMO Systems
by: Alqudah, Eman, et al.
Published: (2025)

HopGNN: Boosting Distributed GNN Training Efficiency via Feature-Centric Model Migration
by: Chen, Weijian, et al.
Published: (2024)

Distributed system perspective on Backscatter systems
by: Guan, Jincheng, et al.
Published: (2025)

Efficient Architecture for RISC-V Vector Memory Access
by: Guan, Hongyi, et al.
Published: (2025)

Unfolding an Atomistic World: Atomistic Simulation of Reactor Pressure Vessel Steel Across Year-and-Meter Scales
by: Han, Haozhi, et al.
Published: (2026)

MTGenRec: An Efficient Distributed Training System for Generative Recommendation Models in Meituan
by: Wang, Yuxiang, et al.
Published: (2025)

FedCod: An Efficient Communication Protocol for Cross-Silo Federated Learning with Coding
by: Yan, Peishen, et al.
Published: (2024)

GreenLLM: Disaggregating Large Language Model Serving on Heterogeneous GPUs for Lower Carbon Emissions
by: Shi, Tianyao, et al.
Published: (2024)

AME: An Efficient Heterogeneous Agentic Memory Engine for Smartphones
by: Zhao, Xinkui, et al.
Published: (2025)

Heimdall++: Optimizing GPU Utilization and Pipeline Parallelism for Efficient Single-Pulse Detection
by: Xia, Bingzheng, et al.
Published: (2025)

Cascadia: An Efficient Cascade Serving System for Large Language Models
by: Jiang, Youhe, et al.
Published: (2025)

XMiner: Efficient Directed Subgraph Matching with Pattern Reduction
by: Yuan, Pingpeng, et al.
Published: (2024)

AgentServe: Algorithm-System Co-Design for Efficient Agentic AI Serving on a Consumer-Grade GPU
by: Zhang, Yuning, et al.
Published: (2026)

Unleashing Efficient Asynchronous RL Post-Training via Staleness-Constrained Rollout Coordination
by: Li, Haoyang, et al.
Published: (2026)

LatencyPrism: Online Non-intrusive Latency Sculpting for SLO-Guaranteed LLM Inference
by: Du, Yin, et al.
Published: (2026)

Efficient Pre-Training of LLMs via Topology-Aware Communication Alignment on More Than 9600 GPUs
by: He, Guoliang, et al.
Published: (2025)

Hetu v2: A General and Scalable Deep Learning System with Hierarchical and Heterogeneous Single Program Multiple Data Annotations
by: Li, Haoyang, et al.
Published: (2025)

6G Twin: Hybrid Gaussian Radio Fields for Channel Estimation and Non-Linear Precoder Design for Radio Access Networks
by: Mohsin, Muhammad Ahmed, et al.
Published: (2025)

An Efficient, Reliable and Observable Collective Communication Library in Large-scale GPU Training Clusters
by: Zhang, Mingjun, et al.
Published: (2025)

HexAGenT: Efficient Agentic LLM Serving via Workflow- and Heterogeneity-Aware Scheduling
by: Peng, You, et al.
Published: (2026)

UELLM: A Unified and Efficient Approach for LLM Inference Serving
by: He, Yiyuan, et al.
Published: (2024)

MegatronApp: Efficient and Comprehensive Management on Distributed LLM Training
by: Zhao, Bohan, et al.
Published: (2025)

Hecate: Unlocking Efficient Sparse Model Training via Fully Sharded Sparse Data Parallelism
by: Qing, Yuhao, et al.
Published: (2025)

CALVO: Improve Serving Efficiency for LLM Inferences with Intense Network Demands
by: Wang, Weiye, et al.
Published: (2026)

Efficient Long Context Fine-tuning with Chunk Flow
by: Yuan, Xiulong, et al.
Published: (2025)

D-CAST: Distributed Consensus Switch in Wireless Trustworthy Autonomous System
by: Yu, Dachao, et al.
Published: (2024)

Cloud-native and Distributed Systems for Efficient and Scalable Large Language Models -- A Research Agenda
by: Xu, Minxian, et al.
Published: (2026)

Cloud Native System for LLM Inference Serving
by: Xu, Minxian, et al.
Published: (2025)

Secure Communication in the Presence of an RIS-Enhanced Eavesdropper in MIMO Networks
by: Zhang, Gaoyuan, et al.
Published: (2025)

Multi-Path Bound for DAG Tasks
by: He, Qingqiang, et al.
Published: (2023)

Multi-Modal Style Transfer-based Prompt Tuning for Efficient Federated Domain Generalization
by: Chen, Yuliang, et al.
Published: (2026)

Loki: A System for Serving ML Inference Pipelines with Hardware and Accuracy Scaling
by: Ahmad, Sohaib, et al.
Published: (2024)

DistFlow: A Fully Distributed RL Framework for Scalable and Efficient LLM Post-Training
by: Wang, Zhixin, et al.
Published: (2025)

Collaborative Inference in DNN-based Satellite Systems with Dynamic Task Streams
by: Guan, Jinglong, et al.
Published: (2023)

GRNND: A GPU-Parallel Relative NN-Descent Algorithm for Efficient Approximate Nearest Neighbor Graph Construction
by: Li, Xiang, et al.
Published: (2025)

Hyperion: Low-Latency Ultra-HD Video Analytics via Collaborative Vision Transformer Inference
by: Jiang, Linyi, et al.
Published: (2025)

An All-Reduce Compatible Top-K Compressor for Communication-Efficient Distributed Learning
by: Chen, Chuyan, et al.
Published: (2025)

BLOCKS: Blockchain-supported Cross-Silo Knowledge Sharing for Efficient LLM Services
by: Zhou, Zhaojiacheng, et al.
Published: (2025)

PUSHtap: PIM-based In-Memory HTAP with Unified Data Storage Format
by: Zhao, Yilong, et al.
Published: (2025)

FourierCompress: Layer-Aware Spectral Activation Compression for Efficient and Accurate Collaborative LLM Inference
by: Ma, Jian, et al.
Published: (2025)