:: Library Catalog

Bibliographic Details
Main Authors:	Ding, Zihao; Zhu, Mufeng; Liu, Yao
Format:	Preprint
Published:	2025
Subjects:	Distributed, Parallel, and Cluster Computing; Artificial Intelligence; Computation and Language; Networking and Internet Architecture; Software Engineering; C.2.2; C.4; I.2.7
Online Access:	https://arxiv.org/abs/2511.07426

Similar Items

Resilient AI Supercomputer Networking using MRC and SRv6
by: Araujo, Joao, et al.
Published: (2026)

AIvailable: A Software-Defined Architecture for LLM-as-a-Service on Heterogeneous and Legacy GPUs
by: Antunes, Pedro, et al.
Published: (2025)

Towards Message Brokers for Generative AI: Survey, Challenges, and Opportunities
by: Saleh, Alaa, et al.
Published: (2023)

Architecture-Aware LLM Inference Optimization on AMD Instinct GPUs: A Comprehensive Benchmark and Deployment Study
by: Georgiou, Athos
Published: (2026)

Neural Router: Semantic Content Matching for Agentic AI
by: Lovén, Lauri, et al.
Published: (2026)

Rotary GPU: Exploring Local Execution Paths for Large Mixture-of-Experts Models Under Limited GPU Memory
by: Jo, Myeong Jun
Published: (2026)

Towards Policy-Enabled Multi-Hop Routing for Cross-Chain Message Delivery
by: Rezaei, Amin, et al.
Published: (2026)

FedMon: Federated eBPF Monitoring for Distributed Anomaly Detection in Multi-Cluster Cloud Environments
by: Zehra, Sehar, et al.
Published: (2025)

Service Discovery-Based Hybrid Network Middleware for Efficient Communication in Distributed Robotic Systems
by: Sang, Shiyao, et al.
Published: (2025)

CooperLLM: Cloud-Edge-End Cooperative Federated Fine-tuning for LLMs via ZOO-based Gradient Correction
by: Sun, He, et al.
Published: (2026)

SLA Management in Reconfigurable Multi-Agent RAG: A Systems Approach to Question Answering
by: Iannelli, Michael, et al.
Published: (2024)

Asynchronous Pipeline Parallelism for Real-Time Multilingual Lip Synchronization in Video Communication Systems
by: Caglar, Eren, et al.
Published: (2025)

Predictive Multi-Tier Memory Management for KV Cache in Large-Scale GPU Inference
by: Ganjihal, Sanjeev Rao
Published: (2026)

Comparative Analysis of Large Language Model Inference Serving Systems: A Performance Study of vLLM and HuggingFace TGI
by: Kolluru, Saicharan
Published: (2025)

Swing: Short-cutting Rings for Higher Bandwidth Allreduce
by: De Sensi, Daniele, et al.
Published: (2024)

Joint Task Offloading and Routing in Wireless Multi-hop Networks Using Biased Backpressure Algorithm
by: Zhao, Zhongyuan, et al.
Published: (2024)

AAFLOW: Scalable Patterns for Agentic AI Workflows
by: Sarker, Arup Kumar, et al.
Published: (2026)

Collective Communication for 100k+ GPUs
by: Si, Min, et al.
Published: (2025)

POD-Attention: Unlocking Full Prefill-Decode Overlap for Faster LLM Inference
by: Kamath, Aditya K, et al.
Published: (2024)

Training LLMs on HPC Systems: Best Practices from the OpenGPT-X Project
by: Penke, Carolin, et al.
Published: (2025)

Cognitive Infrastructure: A Unified DCIM Framework for AI Data Centers
by: Sunkara, Krishna Chaitanya
Published: (2026)

ABACUS: A FinOps Service for Cloud Cost Optimization
by: Deochake, Saurabh
Published: (2024)

Parameter-Efficient and Personalized Federated Training of Generative Models at the Edge
by: Khan, Kabir, et al.
Published: (2025)

Autonomous Trajectory Optimization for UAVs in Disaster Zone Using Henry Gas Optimization Scheme
by: Qadir, Zakria, et al.
Published: (2025)

Directives for Function Offloading in 5G Networks Based on a Performance Characteristics Analysis
by: Dettinger, Falk, et al.
Published: (2025)

DynamiQ: Accelerating Gradient Synchronization using Compressed Multi-hop All-reduce
by: Han, Wenchen, et al.
Published: (2026)

Konnektor: Connection Protocol for Ensuring Peer Uniqueness in Decentralized P2P Networks
by: Ozkan, Onur
Published: (2024)

DSDE: Dynamic Speculative Decoding with KLD Stability for Real-World Serving
by: Yang, Mingyu, et al.
Published: (2025)

Augmenting the FedProx Algorithm by Minimizing Convergence
by: Sarkar, Anomitra, et al.
Published: (2024)

A Survey on Heterogeneous Computing Using SmartNICs and Emerging Data Processing Units
by: Tibbetts, Nathan, et al.
Published: (2025)

Moonshot: Optimizing Chain-Based Rotating Leader BFT via Optimistic Proposals
by: Doidge, Isaac, et al.
Published: (2024)

ConfigSpec: Profiling-Based Configuration Selection for Distributed Edge-Cloud Speculative LLM Serving
by: Li, Xiangchen, et al.
Published: (2026)

WISP: Waste- and Interference-Suppressed Distributed Speculative LLM Serving at the Edge via Dynamic Drafting and SLO-Aware Batching
by: Li, Xiangchen, et al.
Published: (2026)

Nezha: Deployable and High-Performance Consensus Using Synchronized Clocks
by: Geng, Jinkun, et al.
Published: (2022)

Exploring Micro Frontends: A Case Study Application in E-Commerce
by: Kojo, Ricardo Hideki Hangai, et al.
Published: (2025)

ATTNChecker: Highly-Optimized Fault Tolerant Attention for Large Language Model Training
by: Liang, Yuhang, et al.
Published: (2024)

Link-Sharing Backpressure Routing In Wireless Multi-Hop Networks
by: Zhao, Zhongyuan, et al.
Published: (2025)

Benchmarking Federated Learning for Throughput Prediction in 5G Live Streaming Applications
by: Dutta, Yuvraj, et al.
Published: (2025)

LO2: Microservice API Anomaly Dataset of Logs and Metrics
by: Bakhtin, Alexander, et al.
Published: (2025)

Smells-sus: Sustainability Smells in IaC
by: Kosbar, Seif, et al.
Published: (2025)