Library Catalog

Bibliographic Details
Main Authors:	Baldin, Ilya; Goodrich, Michael; Gyurjyan, Vardan; Heyes, Graham; Howard, Derek; Kumar, Yatish; Lawrence, David; Sawatzky, Brad; Sheldon, Stacey; Timmer, Carl
Format:	Preprint
Published:	2025
Subjects:	Distributed, Parallel, and Cluster Computing
Online Access:	https://arxiv.org/abs/2510.12597

Similar Items

Introducing JIRIAF: A Virtual Kubelet Integration for Optimizing HPC Resource Provisioning
by: Gyurjyan, Vardan, et al.
Published: (2025)

A 1024 RV-Cores Shared-L1 Cluster with High Bandwidth Memory Link for Low-Latency 6G-SDR
by: Zhang, Yichao, et al.
Published: (2024)

DP-HLS: A High-Level Synthesis Framework for Accelerating Dynamic Programming Algorithms in Bioinformatics
by: Cao, Yingqi, et al.
Published: (2024)

Optimal Parallel Algorithms for Convex Hulls in 2D and 3D under Noisy Primitive Operations
by: Goodrich, Michael T., et al.
Published: (2025)

Serverless Abstractions for Short-Running, Lightweight Streams
by: Carl, Natalie, et al.
Published: (2026)

Bandwidth-Aware Network Topology Optimization for Decentralized Learning
by: Shen, Yipeng, et al.
Published: (2025)

The Carnot Bound: Limits and Possibilities for Bandwidth-Efficient Consensus
by: Lewis-Pye, Andrew, et al.
Published: (2026)

StreamServe: Adaptive Speculative Flows for Low-Latency Disaggregated LLM Serving
by: Kumar, Satyam, et al.
Published: (2026)

Network-Offloaded Bandwidth-Optimal Broadcast and Allgather for Distributed AI
by: Khalilov, Mikhail, et al.
Published: (2024)

Bandwidth-Aware LLM Inference on Heterogeneous Many-Core Supercomputers
by: Lu, Yao, et al.
Published: (2026)

Performance Optimization in Stream Processing Systems: Experiment-Driven Configuration Tuning for Kafka Streams
by: Chen, David, et al.
Published: (2026)

On-Package Memory with Universal Chiplet Interconnect Express (UCIe): A Low Power, High Bandwidth, Low Latency and Low Cost Approach
by: Sharma, Debendra Das, et al.
Published: (2025)

Asynchronous Latency and Fast Atomic Snapshot
by: Bezerra, João Paulo, et al.
Published: (2024)

LA-IMR: Latency-Aware, Predictive In-Memory Routing and Proactive Autoscaling for Tail-Latency-Sensitive Cloud Robotics
by: Seo, Eunil, et al.
Published: (2025)

DualPath: Breaking the Storage Bandwidth Bottleneck in Agentic LLM Inference
by: Wu, Yongtong, et al.
Published: (2026)

DAG it off: Latency Prefers No Common Coins
by: Amores-Sesar, Ignacio, et al.
Published: (2025)

Methodology for GPU Frequency Switching Latency Measurement
by: Velicka, Daniel, et al.
Published: (2025)

Computation-Bandwidth-Memory Trade-offs: A Unified Paradigm for AI Infrastructure
by: Fan, Yuankai, et al.
Published: (2025)

RL over Commodity Networks: Overcoming the Bandwidth Barrier with Lossless Sparse Deltas
by: Ruan, Chaoyi, et al.
Published: (2026)

WANify: Gauging and Balancing Runtime WAN Bandwidth for Geo-distributed Data Analytics
by: Mohapatra, Anshuman Das, et al.
Published: (2025)

Areon: Latency-Friendly and Resilient Multi-Proposer Consensus
by: Castro-Castilla, Álvaro, et al.
Published: (2025)

Dynamic Client Clustering, Bandwidth Allocation, and Workload Optimization for Semi-synchronous Federated Learning
by: Yu, Liangkun, et al.
Published: (2024)

Bandwidth-Aware and Cost-Efficient Pipeline Parallel Scheduling in Geo-Distributed LLM Training
by: Zhang, Han, et al.
Published: (2026)

Revisiting Speculative Leaderless Protocols for Low-Latency BFT Replication
by: Qian, Daniel, et al.
Published: (2026)

Hiding Latencies in Network-Based Image Loading for Deep Learning
by: Versaci, Francesco, et al.
Published: (2025)

CD-Raft: Reducing the Latency of Distributed Consensus in Cross-Domain Sites
by: Wang, Yangyang, et al.
Published: (2026)

Falcon: Advancing Asynchronous BFT Consensus for Lower Latency and Enhanced Throughput
by: Dai, Xiaohai, et al.
Published: (2025)

Memory Offloading for Large Language Model Inference with Latency SLO Guarantees
by: Ma, Chenxiang, et al.
Published: (2025)

Accelerating Mixture-of-Experts Inference by Hiding Offloading Latency with Speculative Decoding
by: Wang, Zhibin, et al.
Published: (2025)

A New Approach for Evaluating the Performance of Distributed Latency-Sensitive Services
by: Theodoropoulos, Theodoros, et al.
Published: (2024)

Shared Memory-Aware Latency-Sensitive Message Aggregation for Fine-Grained Communication
by: Chandrasekar, Kavitha, et al.
Published: (2024)

BBCA-CHAIN: Low Latency, High Throughput BFT Consensus on a DAG
by: Malkhi, Dahlia, et al.
Published: (2023)

Scene-Aware Latency Estimation for Microservices via Multi-Scale Graph Fusion
by: Sun, Zhichao, et al.
Published: (2026)

OOCO: Latency-disaggregated Architecture for Online-Offline Co-locate LLM Serving
by: Wu, Siyu, et al.
Published: (2025)

Shift Parallelism: Low-Latency, High-Throughput LLM Inference for Dynamic Workloads
by: Hidayetoglu, Mert, et al.
Published: (2025)

Low-Latency Layer-Aware Proactive and Passive Container Migration in Meta Computing
by: Liu, Mengjie, et al.
Published: (2024)

Torpor: GPU-Enabled Serverless Computing for Low-Latency, Resource-Efficient Inference
by: Yu, Minchen, et al.
Published: (2023)

Chasing the Speed of Light: Low-Latency Planetary-Scale Adaptive Byzantine Consensus
by: Berger, Christian, et al.
Published: (2023)

HydraServe: Minimizing Cold Start Latency for Serverless LLM Serving in Public Clouds
by: Lou, Chiheng, et al.
Published: (2025)

PCR: A Prefetch-Enhanced Cache Reuse System for Low-Latency RAG Serving
by: Wang, Wenfeng, et al.
Published: (2026)