| Main Authors: | Han, Haozhi, Zhang, Ruge, Chen, Haoquan, Chen, Yifeng, Jia, Haipeng, Yuan, Liang, Zhang, Yunquan, Cao, Ting, Liu, Yunxin, Zhang, Ya-Qin, Li, Kun |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2604.24091 |
Similar Items
Efficient Remote KV Cache Reuse with GPU-native Video Codec
by: Mi, Liang, et al.
Published: (2026)
Asynch-SGBDT: Asynchronous Parallel Stochastic Gradient Boosting Decision Tree based on Parameters Server
by: Daning, Cheng, et al.
Published: (2018)
AutoTSMM: An Auto-tuning Framework for Building High-Performance Tall-and-Skinny Matrix-Matrix Multiplication on CPUs
by: Li, Chendi, et al.
Published: (2022)
Stencil Matrixization
by: Zhao, Wenxuan, et al.
Published: (2023)
FaasMeter: Energy-First Serverless Computing
by: Rehman, Abdul, et al.
Published: (2024)
Efficient Precoding in XL-MIMO-AFDM System
by: Zhu, Jun, et al.
Published: (2025)
Integrated Sensing, Communication, and Computing: An Information-oriented Resource Transaction Mechanism
by: Chen, Ning, et al.
Published: (2024)
Vec-LUT: Vector Table Lookup for Parallel Ultra-Low-Bit LLM Inference on Edge Devices
by: Li, Xiangyu, et al.
Published: (2025)
SmartWatts: Self-Calibrating Software-Defined Power Meter for Containers
by: Fieni, Guillaume, et al.
Published: (2020)
Joint Resource Optimization, Computation Offloading and Resource Slicing for Multi-Edge Traffic-Cognitive Networks
by: Xiaoyang, Ting, et al.
Published: (2024)
DynaTrain: Fast Online Parallelism Switching for Elastic LLM Training
by: Wang, Yuanqing, et al.
Published: (2026)
DIT: Dimension Reduction View on Optimal NFT Rarity Meters
by: Belousov, Dmitry, et al.
Published: (2025)
DFPL: Decentralized Federated Prototype Learning Across Heterogeneous Data Distributions
by: Zhang, Hongliang, et al.
Published: (2025)
Towards Lock Modularization for Heterogeneous Environments
by: Zhang, Hanze, et al.
Published: (2025)
Domain-Adaptive Model Merging Across Disconnected Modes
by: Liu, Junming, et al.
Published: (2026)
Back to the Future: Rethinking Endorsement in Order-Execute Blockchains
by: Huang, Rongji, et al.
Published: (2026)
P-TimeSync: A Precise Time Synchronization Simulation with Network Propagation Delays
by: Dai, Wei, et al.
Published: (2024)
DWM-RO: Decentralized World Models with Reasoning Offloading for SWIPT-enabled Satellite-Terrestrial HetNets
by: Liu, Guangyuan, et al.
Published: (2025)
Breaking the Aggregation Bottleneck in Federated Recommendation: A Personalized Model Merging Approach
by: Chen, Jundong, et al.
Published: (2025)
SkyStore: Cost-Optimized Object Storage Across Regions and Clouds
by: Liu, Shu, et al.
Published: (2025)
AMSP: Reducing Communication Overhead of ZeRO for Efficient LLM Training
by: Chen, Qiaoling, et al.
Published: (2023)
DecLock: A Case of Decoupled Locking for Disaggregated Memory
by: Zhang, Hanze, et al.
Published: (2025)
DiFache: Efficient and Scalable Caching on Disaggregated Memory using Decentralized Coherence
by: Zhang, Hanze, et al.
Published: (2025)
KIS-S: A GPU-Aware Kubernetes Inference Simulator with RL-Based Auto-Scaling
by: Zhang, Guilin, et al.
Published: (2025)
InternEvo: Efficient Long-sequence Large Language Model Training via Hybrid Parallelism and Redundant Sharding
by: Chen, Qiaoling, et al.
Published: (2024)
TrEnv-X: Transparently Share Serverless Execution Environments Across Different Functions and Nodes
by: Huang, Jialiang, et al.
Published: (2025)
Efficient Parallel Reinforcement Learning Framework using the Reactor Model
by: Kwok, Jacky, et al.
Published: (2023)
A Fully GPU-Accelerated Framework for High-Performance Configuration Interaction Selection with Neural Network Quantum States
by: Sun, Daran, et al.
Published: (2026)
Intelligent Model Update Strategy for Sequential Recommendation
by: Lv, Zheqi, et al.
Published: (2023)
MOSS: A Large-scale Open Microscopic Traffic Simulation System
by: Zhang, Jun, et al.
Published: (2024)
Matryoshka: Optimization of Dynamic Diverse Quantum Chemistry Systems via Elastic Parallelism Transformation
by: Wang, Tuowei, et al.
Published: (2024)
HopGNN: Boosting Distributed GNN Training Efficiency via Feature-Centric Model Migration
by: Chen, Weijian, et al.
Published: (2024)
Communication-Computation Pipeline Parallel Split Learning over Wireless Edge Networks
by: Liu, Chenyu, et al.
Published: (2025)
RServe: Overlapping Encoding and Prefill for Efficient LMM Inference
by: Guo, Tianyu, et al.
Published: (2025)
Boosting LLM Serving through Spatial-Temporal GPU Resource Sharing
by: Lin, Zejia, et al.
Published: (2025)
AlignedServe: Orchestrating Prefix-aware Batching to Build a High-throughput and Computing-efficient LLM Serving System
by: Bai, Fengyao, et al.
Published: (2026)
An Efficient Subspace Algorithm for Federated Learning on Heterogeneous Data
by: Zhang, Jiaojiao, et al.
Published: (2025)
SECO: Secure Inference With Model Splitting Across Multi-Server Hierarchy
by: Chen, Shuangyi, et al.
Published: (2024)
A Theory of Multi-Agent Generative Flow Networks
by: Brunswic, Leo Maxime, et al.
Published: (2025)
Distributed Simulation for Digital Twins of Large-Scale Real-World DiffServ-Based Networks
by: Huang, Zhuoyao, et al.
Published: (2024)