| Main Authors: | Yang, Yuting; Merlina, Andrea; Song, Weijia; Yuan, Tiancheng; Birman, Ken; Vitenberg, Roman |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2402.17652 |
Similar Items
Keep Your Friends Close: Leveraging Affinity Groups to Accelerate AI Inference Workflows
by: Garrett, Thiago, et al.
Published: (2023)
On Replacing Cryptopuzzles with Useful Computation in Blockchain Proof-of-Work Protocols
by: Merlina, Andrea, et al.
Published: (2024)
Speculative Decoding in Decentralized LLM Inference: Turning Communication Latency into Computation Throughput
by: Song, Jingwei, et al.
Published: (2025)
Compass: Optimizing Compound AI Workflows for Dynamic Adaptation
by: Gravara, Milos, et al.
Published: (2026)
A Discussion about Computational Challenges of Programmable Money in Blockchain-based CBDCs
by: da Conceição, Arlindo F., et al.
Published: (2024)
Scalable Runtime Architecture for Data-driven, Hybrid HPC and ML Workflow Applications
by: Merzky, Andre, et al.
Published: (2025)
Reinforcement Learning-driven Data-intensive Workflow Scheduling for Volunteer Edge-Cloud
by: Mounesan, Motahare, et al.
Published: (2024)
WORKSWORLD: A Domain for Integrated Numeric Planning and Scheduling of Distributed Pipelined Workflows
by: Paul, Taylor, et al.
Published: (2026)
Reconstruction-Based Adaptive Scheduling Using AI Inferences in Safety-Critical Systems
by: Alshaer, Samer, et al.
Published: (2025)
Duration-Informed Workload Scheduler
by: Loreti, Daniela, et al.
Published: (2026)
Decentralized Distributed Proximal Policy Optimization (DD-PPO) for High Performance Computing Scheduling on Multi-User Systems
by: Sgambati, Matthew, et al.
Published: (2025)
Mixture-of-Schedulers: An Adaptive Scheduling Agent as a Learned Router for Expert Policies
by: Wang, Xinbo, et al.
Published: (2025)
AI Inference as Relocatable Electricity Demand: A Latency-Constrained Energy-Geography Framework
by: Luo, Xubin, et al.
Published: (2026)
Prompt-Aware Scheduling for Low-Latency LLM Serving
by: Tao, Yiheng, et al.
Published: (2025)
DeServe: Towards Affordable Offline LLM Inference via Decentralization
by: Wu, Linyu, et al.
Published: (2025)
LegoDiffusion: Micro-Serving Text-to-Image Diffusion Workflows
by: Yang, Lingyun, et al.
Published: (2026)
Towards Multi-Model LLM Schedulers: Empirical Insights into Offloading and Preemption
by: Yildiz, Mert, et al.
Published: (2026)
FlowKV: A Disaggregated Inference Framework with Low-Latency KV Cache Transfer and Load-Aware Scheduling
by: Li, Weiqing, et al.
Published: (2025)
TRAIL: Trust-Aware Client Scheduling for Semi-Decentralized Federated Learning
by: Hu, Gangqiang, et al.
Published: (2024)
ELANA: A Simple Energy and Latency Analyzer for LLMs
by: Chiang, Hung-Yueh, et al.
Published: (2025)
Optimizing Split Learning Latency in TinyML-Based IoT Systems
by: Jenhani, Zied, et al.
Published: (2025)
Large Language Model Partitioning for Low-Latency Inference at the Edge
by: Kafetzis, Dimitrios, et al.
Published: (2025)
Topology-aware Preemptive Scheduling for Co-located LLM Workloads
by: Zhang, Ping, et al.
Published: (2024)
CoRaiS: Lightweight Real-Time Scheduler for Multi-Edge Cooperative Computing
by: Hu, Yujiao, et al.
Published: (2024)
D$^{2}$MoE: Dual Routing and Dynamic Scheduling for Efficient On-Device MoE-based LLM Serving
by: Wang, Haodong, et al.
Published: (2025)
A Scheduling Framework for Efficient MoE Inference on Edge GPU-NDP Systems
by: Wu, Qi, et al.
Published: (2026)
Latency-Aware 2-Opt Monotonic Local Search for Distributed Constraint Optimization
by: Rachmut, Ben, et al.
Published: (2025)
On the Impact of White-box Deployment Strategies for Edge AI on Latency and Model Performance
by: Singh, Jaskirat, et al.
Published: (2024)
StreamServe: Adaptive Speculative Flows for Low-Latency Disaggregated LLM Serving
by: Kumar, Satyam, et al.
Published: (2026)
Boosting Asynchronous Decentralized Learning with Model Fragmentation
by: Biswas, Sayan, et al.
Published: (2024)
Byzantine-Robust Decentralized Coordination of LLM Agents
by: Jo, Yongrae, et al.
Published: (2025)
Scepsy: Serving Agentic Workflows Using Aggregate LLM Pipelines
by: Wagenländer, Marcel, et al.
Published: (2026)
Workload Schedulers -- Genesis, Algorithms and Differences
by: Sliwko, Leszek, et al.
Published: (2025)
CubicML: Automated ML for Large ML Systems Co-design with ML Prediction of Performance
by: Wen, Wei, et al.
Published: (2024)
Sentinel: An Aggregation Function to Secure Decentralized Federated Learning
by: Feng, Chao, et al.
Published: (2023)
Decentralized AI: Permissionless LLM Inference on POKT Network
by: Olshansky, Daniel, et al.
Published: (2024)
UnifyFL: Enabling Decentralized Cross-Silo Federated Learning
by: S, Sarang, et al.
Published: (2025)
The (R)evolution of Scientific Workflows in the Agentic AI Era: Towards Autonomous Science
by: Shin, Woong, et al.
Published: (2025)
Electricity Cost Minimization for Multi-Workflow Allocation in Geo-Distributed Data Centers
by: Wang, Shuang, et al.
Published: (2025)
Accelerating Latency-Critical Applications with AI-Powered Semi-Automatic Fine-Grained Parallelization on SMT Processors
by: Los, Denis, et al.
Published: (2025)