| Main Authors: | Guo, Hui, Zheng, Qihang, Huo, Chenghai, Guo, Dongliang, Yang, Haoqi, Zhang, Yang |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2512.21571 |
Similar Items
ScaleLLM: A Resource-Frugal LLM Serving Framework by Optimizing End-to-End Efficiency
by: Yao, Yuhang, et al.
Published: (2024)
Improving the End-to-End Efficiency of Offline Inference for Multi-LLM Applications Based on Sampling and Simulation
by: Fang, Jingzhi, et al.
Published: (2025)
Deal: Distributed End-to-End GNN Inference for All Nodes
by: Chen, Shiyang, et al.
Published: (2025)
StraightLine: An End-to-End Resource-Aware Scheduler for Machine Learning Application Requests
by: Ching, Cheng-Wei, et al.
Published: (2024)
Communication-Efficient Federated Group Distributionally Robust Optimization
by: Guo, Zhishuai, et al.
Published: (2024)
STHFL: Spatio-Temporal Heterogeneous Federated Learning
by: Guo, Shunxin, et al.
Published: (2025)
Canvas: End-to-End Kernel Architecture Search in Neural Networks
by: Zhao, Chenggang, et al.
Published: (2023)
vTensor: Flexible Virtual Tensor Management for Efficient LLM Serving
by: Xu, Jiale, et al.
Published: (2024)
Communication Resources Constrained Hierarchical Federated Learning for End-to-End Autonomous Driving
by: Kou, Wei-Bin, et al.
Published: (2023)
Efficient Unified Caching for Accelerating Heterogeneous AI Workloads
by: Wang, Tianze, et al.
Published: (2025)
End-to-End Verifiable Decentralized Federated Learning
by: Lee, Chaehyeon, et al.
Published: (2024)
Beyond End-to-End: Dynamic Chain Optimization for Private LLM Adaptation on the Edge
by: Wu, Yebo, et al.
Published: (2026)
Agent.xpu: Efficient Scheduling of Agentic LLM Workloads on Heterogeneous SoC
by: Wei, Xinming, et al.
Published: (2025)
PubSub-VFL: Towards Efficient Two-Party Split Learning in Heterogeneous Environments via Publisher/Subscriber Architecture
by: Liu, Yi, et al.
Published: (2025)
Heterogeneity-Aware Cooperative Federated Edge Learning with Adaptive Computation and Communication Compression
by: Zhang, Zhenxiao, et al.
Published: (2024)
A Heterogeneous Chiplet Architecture for Accelerating End-to-End Transformer Models
by: Sharma, Harsh, et al.
Published: (2023)
FT-Transformer: Resilient and Reliable Transformer with End-to-End Fault Tolerant Attention
by: Dai, Huangliang, et al.
Published: (2025)
An End-to-End DNN Inference Framework for the SpiNNaker2 Neuromorphic MPSoC
by: Jobst, Matthias, et al.
Published: (2025)
Agglomerative Federated Learning: Empowering Larger Model Training via End-Edge-Cloud Collaboration
by: Wu, Zhiyuan, et al.
Published: (2023)
An Efficient Subspace Algorithm for Federated Learning on Heterogeneous Data
by: Zhang, Jiaojiao, et al.
Published: (2025)
Addressing Skewed Heterogeneity via Federated Prototype Rectification with Personalization
by: Guo, Shunxin, et al.
Published: (2024)
Beyond Model Scale Limits: End-Edge-Cloud Federated Learning with Self-Rectified Knowledge Agglomeration
by: Wu, Zhiyuan, et al.
Published: (2025)
FedORGP: Guiding Heterogeneous Federated Learning with Orthogonality Regularization on Global Prototypes
by: Guo, Fucheng, et al.
Published: (2025)
PracMHBench: Re-evaluating Model-Heterogeneous Federated Learning Based on Practical Edge Device Constraints
by: Guo, Yuanchun, et al.
Published: (2025)
HetCCL: Accelerating LLM Training with Heterogeneous GPUs
by: Kim, Heehoon, et al.
Published: (2026)
MATCHA: Efficient Deployment of Deep Neural Networks on Multi-Accelerator Heterogeneous Edge SoCs
by: Russo, Enrico, et al.
Published: (2026)
SMoFi: Step-wise Momentum Fusion for Split Federated Learning on Heterogeneous Data
by: Yang, Mingkun, et al.
Published: (2025)
No Request Left Behind: Tackling Heterogeneity in Long-Context LLM Inference with Medha
by: Agrawal, Amey, et al.
Published: (2024)
Efficient Heterogeneous Large Language Model Decoding with Model-Attention Disaggregation
by: Chen, Shaoyuan, et al.
Published: (2024)
MorphServe: Efficient and Workload-Aware LLM Serving via Runtime Quantized Layer Swapping and KV Cache Resizing
by: Su, Zhaoyuan, et al.
Published: (2025)
RL in the Wild: Characterizing RLVR Training in LLM Deployment
by: Zhou, Jiecheng, et al.
Published: (2025)
AutoSP: Unlocking Long-Context LLM Training Via Compiler-Based Sequence Parallelism
by: Gupta, Ahan, et al.
Published: (2026)
Adacc: An Adaptive Framework Unifying Compression and Activation Recomputation for LLM Training
by: Chen, Ping, et al.
Published: (2025)
On the Convergence Rates of Federated Q-Learning across Heterogeneous Environments
by: Wang, Leo Muxing, et al.
Published: (2024)
A Semi-Supervised Federated Learning Framework with Hierarchical Clustering Aggregation for Heterogeneous Satellite Networks
by: Liu, Zhuocheng, et al.
Published: (2025)
SAIR: Cost-Efficient Multi-Stage ML Pipeline Autoscaling via In-Context Reinforcement Learning
by: Su, Jianchang, et al.
Published: (2026)
70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float (DFloat11)
by: Zhang, Tianyi, et al.
Published: (2025)
AdaptiveFL: Adaptive Heterogeneous Federated Learning for Resource-Constrained AIoT Systems
by: Jia, Chentao, et al.
Published: (2023)
Non-Federated Multi-Task Split Learning for Heterogeneous Sources
by: Zheng, Yilin, et al.
Published: (2024)
Fed-pilot: Optimizing LoRA Allocation for Efficient Federated Fine-Tuning with Heterogeneous Clients
by: Zhang, Zikai, et al.
Published: (2024)