| Main Authors: | Yan, Jinqi; He, Fang; Sang, Qianlong; Tong, Bifeng; Sun, Peng; Gong, Yili; Hu, Chuang; Cheng, Dazhao |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2509.22707 |
Similar Items
Exploring Uncore Frequency Scaling for Heterogeneous Computing
by: Zheng, Zhong, et al.
Published: (2025)
Benchmarking Different Application Types across Heterogeneous Cloud Compute Services
by: Duggi, Nivedhitha, et al.
Published: (2025)
MCFuser: High-Performance and Rapid Fusion of Memory-Bound Compute-Intensive Operators
by: Zhang, Zheng, et al.
Published: (2025)
Enabling Disaggregated Multi-Stage MLLM Inference via GPU-Internal Scheduling and Resource Sharing
by: Zhao, Lingxiao, et al.
Published: (2025)
MPipeMoE: Memory Efficient MoE for Pre-trained Models with Adaptive Pipeline Parallelism
by: Zhang, Zheng, et al.
Published: (2025)
Understanding and Reducing Metadata-Driven Host Overheads in Sampling-Based GNN Training
by: Gong, Yidong, et al.
Published: (2026)
MIDAS: Adaptive Proxy Middleware for Mitigating Metadata Hotspots in HPC I/O at Scale
by: Ghimire, Sangam, et al.
Published: (2025)
SimDC: A High-Fidelity Device Simulation Platform for Device-Cloud Collaborative Computing
by: Pei, Ruiguang, et al.
Published: (2025)
FedAPTA: Federated Multi-task Learning for Heterogeneous Devices with Adaptive Layer-wise Pruning and Task-aware Aggregation
by: Yu, Zhen, et al.
Published: (2025)
Scaling Up Throughput-oriented LLM Inference Applications on Heterogeneous Opportunistic GPU Clusters with Pervasive Context Management
by: Phung, Thanh Son, et al.
Published: (2025)
High-Throughput LLM inference on Heterogeneous Clusters
by: Xiong, Yi, et al.
Published: (2025)
HeteGen: Heterogeneous Parallel Inference for Large Language Models on Resource-Constrained Devices
by: Zhao, Xuanlei, et al.
Published: (2024)
Automatic Metadata Capture and Processing for High-Performance Workflows
by: Shpilker, Polina, et al.
Published: (2025)
HGraphScale: Hierarchical Graph Learning for Autoscaling Microservice Applications in Container-based Cloud Computing
by: Fang, Zhengxin, et al.
Published: (2025)
Memory-Efficient Split Federated Learning for LLM Fine-Tuning on Heterogeneous Mobile Devices
by: Chen, Xiaopei, et al.
Published: (2025)
FREESH: Fair, Resource- and Energy-Efficient Scheduling for LLM Serving on Heterogeneous GPUs
by: He, Xuan, et al.
Published: (2025)
Bridging Memory Gaps: Scaling Federated Learning for Heterogeneous Clients
by: Wu, Yebo, et al.
Published: (2024)
WindVE: Collaborative CPU-NPU Vector Embedding
by: Huang, Jinqi, et al.
Published: (2025)
SLO-Aware Scheduling for Large Language Model Inferences
by: Huang, Jinqi, et al.
Published: (2025)
Single-Loop Federated Actor-Critic across Heterogeneous Environments
by: Zhu, Ye, et al.
Published: (2024)
Leveraging Core and Uncore Frequency Scaling for Power-Efficient Serverless Workflows
by: Tzenetopoulos, Achilleas, et al.
Published: (2024)
Failure-Resilient Distributed Inference with Model Compression over Heterogeneous Edge Devices
by: Wang, Li, et al.
Published: (2024)
A Robust Federated Learning Framework for Undependable Devices at Scale
by: Wang, Shilong, et al.
Published: (2024)
Benchmarking Machine Learning Applications on Heterogeneous Architecture using Reframe
by: Rae, Christopher, et al.
Published: (2024)
Scaling MPI Applications on Aurora
by: Ibeid, Huda, et al.
Published: (2025)
Poplar: Efficient Scaling of Distributed DNN Training on Heterogeneous GPU Clusters
by: Zhang, WenZheng, et al.
Published: (2024)
Formal and Empirical Study of Metadata-Based Profiling for Resource Management in the Computing Continuum
by: Morichetta, Andrea, et al.
Published: (2025)
A Hybrid Communication Approach for Metadata Exchange in Geo-Distributed Fog Environments
by: Kruber, Marvin, et al.
Published: (2023)
Adaptable TeaStore
by: Bliudze, Simon, et al.
Published: (2024)
PerCache: Predictive Hierarchical Cache for RAG Applications on Mobile Devices
by: Liu, Kaiwei, et al.
Published: (2025)
PrefillOnly: An Inference Engine for Prefill-only Workloads in Large Language Model Applications
by: Du, Kuntai, et al.
Published: (2025)
HexiScale: Facilitating Large Language Model Training over Heterogeneous Hardware
by: Yan, Ran, et al.
Published: (2024)
StatuScale: Status-aware and Elastic Scaling Strategy for Microservice Applications
by: Wen, Linfeng, et al.
Published: (2024)
Dynamic Detection of Inefficient Data Mapping Patterns in Heterogeneous OpenMP Applications
by: Marzen, Luke, et al.
Published: (2026)
Towards Affordable, Adaptive and Automatic GNN Training on CPU-GPU Heterogeneous Platforms
by: Qiao, Tong, et al.
Published: (2025)
Distributed On-Device LLM Inference With Over-the-Air Computation
by: Zhang, Kai, et al.
Published: (2025)
Will LLMs Scaling Hit the Wall? Breaking Barriers via Distributed Resources on Massive Edge Devices
by: Shen, Tao, et al.
Published: (2025)
FlexFL: Heterogeneous Federated Learning via APoZ-Guided Flexible Pruning in Uncertain Scenarios
by: Chen, Zekai, et al.
Published: (2024)
Clock2Q+: A Simple and Efficient Replacement Algorithm for Metadata Cache in VMware vSAN
by: Zhai, Yiyan, et al.
Published: (2025)
Parallel Data Object Creation: Towards Scalable Metadata Management in High-Performance I/O Library
by: Li, Youjia, et al.
Published: (2025)