:: Library Catalog

Bibliographic Details
Main Authors:	Wang, Peng; Liu, Yu; Liu, Ziqi; Wang, Ming-Yang; Liu, Ke; Zhou, Ke; Huang, Zhihai
Format:	Preprint
Published:	2024
Subjects:	Distributed, Parallel, and Cluster Computing
Online Access:	https://arxiv.org/abs/2402.16262

Similar Items

OnePiece: A Large-Scale Distributed Inference System with RDMA for Complex AI-Generated Content (AIGC) Workflows
by: Chen, June, et al.
Published: (2026)

CoCoI: Distributed Coded Inference System for Straggler Mitigation
by: Liu, Xing, et al.
Published: (2025)

Content-Oblivious Leader Election in 2-Edge-Connected Networks
by: Chalopin, Jérémie, et al.
Published: (2025)

Big Data-Driven Fraud Detection Using Machine Learning and Real-Time Stream Processing
by: Liu, Chen, et al.
Published: (2025)

PROSERVE: Unified Multi-Priority Request Scheduling for LLM Serving
by: Huang, Weizhe, et al.
Published: (2025)

OOCO: Latency-disaggregated Architecture for Online-Offline Co-locate LLM Serving
by: Wu, Siyu, et al.
Published: (2025)

LLM-CoOpt: A Co-Design and Optimization Framework for Efficient LLM Inference on Heterogeneous Platforms
by: Kong, Jie, et al.
Published: (2026)

Efficient Counting and Simulation in Content-Oblivious Rings
by: Chalopin, Jérémie, et al.
Published: (2026)

Enabling Efficient Batch Serving for LMaaS via Generation Length Prediction
by: Cheng, Ke, et al.
Published: (2024)

FaaSTube: Optimizing GPU-oriented Data Transfer for Serverless Computing
by: Wu, Hao, et al.
Published: (2024)

HydraInfer: Hybrid Disaggregated Scheduling for Multimodal Large Language Model Serving
by: Dong, Xianzhe, et al.
Published: (2025)

Integrated Sensing, Communication, and Computing: An Information-oriented Resource Transaction Mechanism
by: Chen, Ning, et al.
Published: (2024)

A Survey on Adversarial Contention Resolution
by: Banicescu, Ioana, et al.
Published: (2024)

Softening the Impact of Collisions in Contention Resolution
by: Biswas, Umesh, et al.
Published: (2024)

Beyond 2-Edge-Connectivity: Algorithms and Impossibility for Content-Oblivious Leader Election
by: Chang, Yi-Jun, et al.
Published: (2025)

Non-Uniform Content-Oblivious Leader Election on Oriented Asynchronous Rings
by: Chalopin, Jérémie, et al.
Published: (2025)

HexGen: Generative Inference of Large Language Model over Heterogeneous Environment
by: Jiang, Youhe, et al.
Published: (2023)

TIDAL: Recovering Temporal Phase for Cloud Block Storage Placement from LLM-Derived Semantics
by: Tan, Difan, et al.
Published: (2026)

QoE-oriented Dependent Task Scheduling under Multi-dimensional QoS Constraints over Distributed Networks
by: Fan, Xuwei, et al.
Published: (2023)

Graph-Structured Deep Learning Framework for Multi-task Contention Identification with High-dimensional Metrics
by: Yang, Xiao, et al.
Published: (2026)

A Contention-Free Model for Converged Kubernetes on HPC
by: Sochat, Vanessa, et al.
Published: (2024)

Warp-STAR: High-performance, Differentiable GPU-Accelerated Static Timing Analysis through Warp-oriented Parallel Orchestration
by: Huang, En-Ming, et al.
Published: (2026)

FedHC: A Hierarchical Clustered Federated Learning Framework for Satellite Networks
by: Liu, Zhuocheng, et al.
Published: (2025)

Slice-Level Scheduling for High Throughput and Load Balanced LLM Serving
by: Cheng, Ke, et al.
Published: (2024)

HexAGenT: Efficient Agentic LLM Serving via Workflow- and Heterogeneity-Aware Scheduling
by: Peng, You, et al.
Published: (2026)

HeteGen: Heterogeneous Parallel Inference for Large Language Models on Resource-Constrained Devices
by: Zhao, Xuanlei, et al.
Published: (2024)

GPU-Accelerated Batch-Dynamic Subgraph Matching
by: Qiu, Linshan, et al.
Published: (2024)

Accelerating Microswimmer Simulations via a Heterogeneous Pipelined Parallel-in-Time Framework
by: Huang, Ruixiang, et al.
Published: (2026)

Harpagon: Minimizing DNN Serving Cost via Efficient Dispatching, Scheduling and Splitting
by: Zhao, Zhixin, et al.
Published: (2024)

FedCache: A Knowledge Cache-driven Federated Learning Architecture for Personalized Edge Intelligence
by: Wu, Zhiyuan, et al.
Published: (2023)

Arrow: Adaptive Scheduling Mechanisms for Disaggregated LLM Inference Architecture
by: Wu, Yu, et al.
Published: (2025)

A Thorough Investigation of Content-Defined Chunking Algorithms for Data Deduplication
by: Gregoriadis, Marcel, et al.
Published: (2024)

AdaBridge: Dynamic Data and Computation Reuse for Efficient Multi-task DNN Co-evolution in Edge Systems
by: Wang, Lehao, et al.
Published: (2024)

ServeGen: Workload Characterization and Generation of Large Language Model Serving in Production
by: Xiang, Yuxing, et al.
Published: (2025)

SCOOT: SLO-Oriented Performance Tuning for LLM Inference Engines
by: Cheng, Ke, et al.
Published: (2024)

Accelerating the Delivery of Data Services over Uncertain Mobile Crowdsensing Networks
by: Liwang, Minghui, et al.
Published: (2022)

DiT-HC: Enabling Efficient Training of Visual Generation Model DiT on HPC-oriented CPU Cluster
by: Zhang, Jinxiao, et al.
Published: (2026)

ConChain: A Scheme for Contention-free and Attack Resilient BlockChain
by: Bappy, Faisal Haque, et al.
Published: (2023)

BandPilot: Towards Performance- and Contention-Aware GPU Dispatching in AI Clusters
by: Zhang, Kunming, et al.
Published: (2025)

Contention Resolution, With and Without a Global Clock
by: Cai, Zixi, et al.
Published: (2026)