:: Library Catalog

Bibliographic Details
Main Authors:	Yang, Yuting; Merlina, Andrea; Song, Weijia; Yuan, Tiancheng; Birman, Ken; Vitenberg, Roman
Format:	Preprint
Published:	2024
Subjects:	Distributed, Parallel, and Cluster Computing; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2402.17652

Similar Items

Keep Your Friends Close: Leveraging Affinity Groups to Accelerate AI Inference Workflows
by: Garrett, Thiago, et al.
Published: (2023)

On Replacing Cryptopuzzles with Useful Computation in Blockchain Proof-of-Work Protocols
by: Merlina, Andrea, et al.
Published: (2024)

Speculative Decoding in Decentralized LLM Inference: Turning Communication Latency into Computation Throughput
by: Song, Jingwei, et al.
Published: (2025)

Compass: Optimizing Compound AI Workflows for Dynamic Adaptation
by: Gravara, Milos, et al.
Published: (2026)

A Discussion about Computational Challenges of Programmable Money in Blockchain-based CBDCs
by: da Conceição, Arlindo F., et al.
Published: (2024)

Scalable Runtime Architecture for Data-driven, Hybrid HPC and ML Workflow Applications
by: Merzky, Andre, et al.
Published: (2025)

Reinforcement Learning-driven Data-intensive Workflow Scheduling for Volunteer Edge-Cloud
by: Mounesan, Motahare, et al.
Published: (2024)

WORKSWORLD: A Domain for Integrated Numeric Planning and Scheduling of Distributed Pipelined Workflows
by: Paul, Taylor, et al.
Published: (2026)

Reconstruction-Based Adaptive Scheduling Using AI Inferences in Safety-Critical Systems
by: Alshaer, Samer, et al.
Published: (2025)

Duration-Informed Workload Scheduler
by: Loreti, Daniela, et al.
Published: (2026)

Decentralized Distributed Proximal Policy Optimization (DD-PPO) for High Performance Computing Scheduling on Multi-User Systems
by: Sgambati, Matthew, et al.
Published: (2025)

Mixture-of-Schedulers: An Adaptive Scheduling Agent as a Learned Router for Expert Policies
by: Wang, Xinbo, et al.
Published: (2025)

AI Inference as Relocatable Electricity Demand: A Latency-Constrained Energy-Geography Framework
by: Luo, Xubin, et al.
Published: (2026)

Prompt-Aware Scheduling for Low-Latency LLM Serving
by: Tao, Yiheng, et al.
Published: (2025)

DeServe: Towards Affordable Offline LLM Inference via Decentralization
by: Wu, Linyu, et al.
Published: (2025)

LegoDiffusion: Micro-Serving Text-to-Image Diffusion Workflows
by: Yang, Lingyun, et al.
Published: (2026)

Towards Multi-Model LLM Schedulers: Empirical Insights into Offloading and Preemption
by: Yildiz, Mert, et al.
Published: (2026)

FlowKV: A Disaggregated Inference Framework with Low-Latency KV Cache Transfer and Load-Aware Scheduling
by: Li, Weiqing, et al.
Published: (2025)

TRAIL: Trust-Aware Client Scheduling for Semi-Decentralized Federated Learning
by: Hu, Gangqiang, et al.
Published: (2024)

ELANA: A Simple Energy and Latency Analyzer for LLMs
by: Chiang, Hung-Yueh, et al.
Published: (2025)

Optimizing Split Learning Latency in TinyML-Based IoT Systems
by: Jenhani, Zied, et al.
Published: (2025)

Large Language Model Partitioning for Low-Latency Inference at the Edge
by: Kafetzis, Dimitrios, et al.
Published: (2025)

Topology-aware Preemptive Scheduling for Co-located LLM Workloads
by: Zhang, Ping, et al.
Published: (2024)

CoRaiS: Lightweight Real-Time Scheduler for Multi-Edge Cooperative Computing
by: Hu, Yujiao, et al.
Published: (2024)

D$^{2}$MoE: Dual Routing and Dynamic Scheduling for Efficient On-Device MoE-based LLM Serving
by: Wang, Haodong, et al.
Published: (2025)

A Scheduling Framework for Efficient MoE Inference on Edge GPU-NDP Systems
by: Wu, Qi, et al.
Published: (2026)

Latency-Aware 2-Opt Monotonic Local Search for Distributed Constraint Optimization
by: Rachmut, Ben, et al.
Published: (2025)

On the Impact of White-box Deployment Strategies for Edge AI on Latency and Model Performance
by: Singh, Jaskirat, et al.
Published: (2024)

StreamServe: Adaptive Speculative Flows for Low-Latency Disaggregated LLM Serving
by: Kumar, Satyam, et al.
Published: (2026)

Boosting Asynchronous Decentralized Learning with Model Fragmentation
by: Biswas, Sayan, et al.
Published: (2024)

Byzantine-Robust Decentralized Coordination of LLM Agents
by: Jo, Yongrae, et al.
Published: (2025)

Scepsy: Serving Agentic Workflows Using Aggregate LLM Pipelines
by: Wagenländer, Marcel, et al.
Published: (2026)

Workload Schedulers -- Genesis, Algorithms and Differences
by: Sliwko, Leszek, et al.
Published: (2025)

CubicML: Automated ML for Large ML Systems Co-design with ML Prediction of Performance
by: Wen, Wei, et al.
Published: (2024)

Sentinel: An Aggregation Function to Secure Decentralized Federated Learning
by: Feng, Chao, et al.
Published: (2023)

Decentralized AI: Permissionless LLM Inference on POKT Network
by: Olshansky, Daniel, et al.
Published: (2024)

UnifyFL: Enabling Decentralized Cross-Silo Federated Learning
by: S, Sarang, et al.
Published: (2025)

The (R)evolution of Scientific Workflows in the Agentic AI Era: Towards Autonomous Science
by: Shin, Woong, et al.
Published: (2025)

Electricity Cost Minimization for Multi-Workflow Allocation in Geo-Distributed Data Centers
by: Wang, Shuang, et al.
Published: (2025)

Accelerating Latency-Critical Applications with AI-Powered Semi-Automatic Fine-Grained Parallelization on SMT Processors
by: Los, Denis, et al.
Published: (2025)