| Main Authors: | Ding, Zihao, Zhu, Mufeng, Liu, Yao |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2511.07426 |
Similar Items
Resilient AI Supercomputer Networking using MRC and SRv6
by: Araujo, Joao, et al.
Published: (2026)
AIvailable: A Software-Defined Architecture for LLM-as-a-Service on Heterogeneous and Legacy GPUs
by: Antunes, Pedro, et al.
Published: (2025)
Towards Message Brokers for Generative AI: Survey, Challenges, and Opportunities
by: Saleh, Alaa, et al.
Published: (2023)
Architecture-Aware LLM Inference Optimization on AMD Instinct GPUs: A Comprehensive Benchmark and Deployment Study
by: Georgiou, Athos
Published: (2026)
Neural Router: Semantic Content Matching for Agentic AI
by: Lovén, Lauri, et al.
Published: (2026)
Rotary GPU: Exploring Local Execution Paths for Large Mixture-of-Experts Models Under Limited GPU Memory
by: Jo, Myeong Jun
Published: (2026)
Towards Policy-Enabled Multi-Hop Routing for Cross-Chain Message Delivery
by: Rezaei, Amin, et al.
Published: (2026)
FedMon: Federated eBPF Monitoring for Distributed Anomaly Detection in Multi-Cluster Cloud Environments
by: Zehra, Sehar, et al.
Published: (2025)
Service Discovery-Based Hybrid Network Middleware for Efficient Communication in Distributed Robotic Systems
by: Sang, Shiyao, et al.
Published: (2025)
CooperLLM: Cloud-Edge-End Cooperative Federated Fine-tuning for LLMs via ZOO-based Gradient Correction
by: Sun, He, et al.
Published: (2026)
SLA Management in Reconfigurable Multi-Agent RAG: A Systems Approach to Question Answering
by: Iannelli, Michael, et al.
Published: (2024)
Asynchronous Pipeline Parallelism for Real-Time Multilingual Lip Synchronization in Video Communication Systems
by: Caglar, Eren, et al.
Published: (2025)
Predictive Multi-Tier Memory Management for KV Cache in Large-Scale GPU Inference
by: Ganjihal, Sanjeev Rao
Published: (2026)
Comparative Analysis of Large Language Model Inference Serving Systems: A Performance Study of vLLM and HuggingFace TGI
by: Kolluru, Saicharan
Published: (2025)
Swing: Short-cutting Rings for Higher Bandwidth Allreduce
by: De Sensi, Daniele, et al.
Published: (2024)
Joint Task Offloading and Routing in Wireless Multi-hop Networks Using Biased Backpressure Algorithm
by: Zhao, Zhongyuan, et al.
Published: (2024)
AAFLOW: Scalable Patterns for Agentic AI Workflows
by: Sarker, Arup Kumar, et al.
Published: (2026)
Collective Communication for 100k+ GPUs
by: Si, Min, et al.
Published: (2025)
POD-Attention: Unlocking Full Prefill-Decode Overlap for Faster LLM Inference
by: Kamath, Aditya K, et al.
Published: (2024)
Training LLMs on HPC Systems: Best Practices from the OpenGPT-X Project
by: Penke, Carolin, et al.
Published: (2025)
Cognitive Infrastructure: A Unified DCIM Framework for AI Data Centers
by: Sunkara, Krishna Chaitanya
Published: (2026)
ABACUS: A FinOps Service for Cloud Cost Optimization
by: Deochake, Saurabh
Published: (2024)
Parameter-Efficient and Personalized Federated Training of Generative Models at the Edge
by: Khan, Kabir, et al.
Published: (2025)
Autonomous Trajectory Optimization for UAVs in Disaster Zone Using Henry Gas Optimization Scheme
by: Qadir, Zakria, et al.
Published: (2025)
Directives for Function Offloading in 5G Networks Based on a Performance Characteristics Analysis
by: Dettinger, Falk, et al.
Published: (2025)
DynamiQ: Accelerating Gradient Synchronization using Compressed Multi-hop All-reduce
by: Han, Wenchen, et al.
Published: (2026)
Konnektor: Connection Protocol for Ensuring Peer Uniqueness in Decentralized P2P Networks
by: Ozkan, Onur
Published: (2024)
DSDE: Dynamic Speculative Decoding with KLD Stability for Real-World Serving
by: Yang, Mingyu, et al.
Published: (2025)
Augmenting the FedProx Algorithm by Minimizing Convergence
by: Sarkar, Anomitra, et al.
Published: (2024)
A Survey on Heterogeneous Computing Using SmartNICs and Emerging Data Processing Units
by: Tibbetts, Nathan, et al.
Published: (2025)
Moonshot: Optimizing Chain-Based Rotating Leader BFT via Optimistic Proposals
by: Doidge, Isaac, et al.
Published: (2024)
ConfigSpec: Profiling-Based Configuration Selection for Distributed Edge-Cloud Speculative LLM Serving
by: Li, Xiangchen, et al.
Published: (2026)
WISP: Waste- and Interference-Suppressed Distributed Speculative LLM Serving at the Edge via Dynamic Drafting and SLO-Aware Batching
by: Li, Xiangchen, et al.
Published: (2026)
Nezha: Deployable and High-Performance Consensus Using Synchronized Clocks
by: Geng, Jinkun, et al.
Published: (2022)
Exploring Micro Frontends: A Case Study Application in E-Commerce
by: Kojo, Ricardo Hideki Hangai, et al.
Published: (2025)
ATTNChecker: Highly-Optimized Fault Tolerant Attention for Large Language Model Training
by: Liang, Yuhang, et al.
Published: (2024)
Link-Sharing Backpressure Routing In Wireless Multi-Hop Networks
by: Zhao, Zhongyuan, et al.
Published: (2025)
Benchmarking Federated Learning for Throughput Prediction in 5G Live Streaming Applications
by: Dutta, Yuvraj, et al.
Published: (2025)
LO2: Microservice API Anomaly Dataset of Logs and Metrics
by: Bakhtin, Alexander, et al.
Published: (2025)
Smells-sus: Sustainability Smells in IaC
by: Kosbar, Seif, et al.
Published: (2025)