| Main Authors: | Wang, Zerui, Liu, Yan, Huang, Jun |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2411.03376 |
Similar Items
KVServe: Service-Aware KV Cache Compression for Communication-Efficient Disaggregated LLM Serving
by: Liu, Zedong, et al.
Published: (2026)
NebulaFL: Effective Asynchronous Federated Learning for JointCloud Computing
by: Gao, Fei, et al.
Published: (2024)
Generative AI on the Edge: Architecture and Performance Evaluation
by: Nezami, Zeinab, et al.
Published: (2024)
AI Greenferencing: Routing AI Inferencing to Green Modular Data Centers with Heron
by: Reddy, Tella Rajashekhar, et al.
Published: (2025)
Adaptive Parameter-Efficient Federated Fine-Tuning on Heterogeneous Devices
by: Liu, Jun, et al.
Published: (2024)
FogROS2-FT: Fault Tolerant Cloud Robotics
by: Chen, Kaiyuan, et al.
Published: (2024)
High-speed Networking for Giga-Scale AI Factories
by: Khashab, Sajy, et al.
Published: (2026)
Smaller, Smarter, Closer: The Edge of Collaborative Generative AI
by: Morabito, Roberto, et al.
Published: (2025)
Trust-Aware Routing for Distributed Generative AI Inference at the Edge
by: Nguyen, Chanh, et al.
Published: (2026)
Towards Net-Zero Carbon Emissions in Network AI for 6G and Beyond
by: Zhang, Peng, et al.
Published: (2023)
ScaleAcross Explorer: Exploring Communication Optimization for Scale-Across AI Model Training
by: Li, Minghao, et al.
Published: (2026)
SpaceMoE: Realizing Distributed Mixture-of-Experts Inference over Space Networks
by: Wang, Zhanwei, et al.
Published: (2026)
Meili: Enabling SmartNIC as a Service in the Cloud
by: Su, Qiang, et al.
Published: (2023)
When Digital Twin Meets 6G: Concepts, Obstacles, and Research Prospects
by: Liu, Wenshuai, et al.
Published: (2024)
Rina: Enhancing Ring-AllReduce with In-network Aggregation in Distributed Model Training
by: Chen, Zixuan, et al.
Published: (2024)
An Open-Source Experimentation Framework for the Edge Cloud Continuum
by: Koukis, Georgios, et al.
Published: (2024)
$Λ$-Split: A Privacy-Preserving Split Computing Framework for Cloud-Powered Generative AI
by: Ohta, Shoki, et al.
Published: (2023)
Agentic Performance at the Edge: Insights from Benchmarking
by: Wang, Shiqiang, et al.
Published: (2026)
Towards Edge General Intelligence via Large Language Models: Opportunities and Challenges
by: Chen, Handi, et al.
Published: (2024)
Towards Practical Operation of Deep Reinforcement Learning Agents in Real-World Network Management at Open RAN Edges
by: Li, Haiyuan, et al.
Published: (2024)
HALO: Semantic-Aware Distributed LLM Inference in Lossy Edge Network
by: Zheng, Peirong, et al.
Published: (2026)
PerLLM: Personalized Inference Scheduling with Edge-Cloud Collaboration for Diverse LLM Services
by: Yang, Zheming, et al.
Published: (2024)
Digital Twinning of a Pressurized Water Reactor Startup Operation and Partial Computational Offloading in In-network Computing-Assisted Multiaccess Edge Computing
by: Aliyu, Ibrahim, et al.
Published: (2024)
Context-Aware Orchestration of Energy-Efficient Gossip Learning Schemes
by: Dinani, Mina Aghaei, et al.
Published: (2024)
When IoT Meet LLMs: Applications and Challenges
by: Kok, Ibrahim, et al.
Published: (2024)
Design and Optimization of Hierarchical Gradient Coding for Distributed Learning at Edge Devices
by: Tang, Weiheng, et al.
Published: (2024)
Teola: Towards End-to-End Optimization of LLM-based Applications
by: Tan, Xin, et al.
Published: (2024)
Collective Communication Profiling of Modern-day Machine Learning Workloads
by: Gupta, Jit, et al.
Published: (2025)
Optimizing Resource Allocation for Geographically-Distributed Inference by Large Language Models
by: Sun, Tingyang, et al.
Published: (2025)
Optimizing Split Learning Latency in TinyML-Based IoT Systems
by: Jenhani, Zied, et al.
Published: (2025)
Cluster Topology-Driven Placement of Experts Reduces Network Traffic in MoE Inference
by: Sivtsov, Danil, et al.
Published: (2025)
Move the Query, Not the Cache: Characterizing Cross-Instance Latent Attention Redistribution Across GPU Fabrics
by: Ma, Bole, et al.
Published: (2026)
The Implications of Decentralization in Blockchained Federated Learning: Evaluating the Impact of Model Staleness and Inconsistencies
by: Wilhelmi, Francesc, et al.
Published: (2023)
Enabling Reconfiguration-Communication Overlap for Collective Communication in Optical Networks
by: Wu, Changbo, et al.
Published: (2025)
eACGM: Non-instrumented Performance Tracing and Anomaly Detection towards Machine Learning Systems
by: Xu, Ruilin, et al.
Published: (2025)
Jupiter: Fast and Resource-Efficient Collaborative Inference of Generative LLMs on Edge Devices
by: Ye, Shengyuan, et al.
Published: (2025)
Enabling Intelligent Vehicular Networks Through Distributed Learning in the Non-Terrestrial Networks 6G Vision
by: Naseh, David, et al.
Published: (2023)
XWind: A Cross-site Router for Large Language Model Inference Serving at Renewable Energy Farms
by: Reddy, Tella Rajashekhar, et al.
Published: (2026)
Intelligent Task Offloading: Advanced MEC Task Offloading and Resource Management in 5G Networks
by: Ebrahimi, Alireza, et al.
Published: (2025)
AIvailable: A Software-Defined Architecture for LLM-as-a-Service on Heterogeneous and Legacy GPUs
by: Antunes, Pedro, et al.
Published: (2025)