Library Catalog

Bibliographic Details
Main Authors:	AbouElhamayed, Ahmed F.; Balle, Susanne; Singh, Deshanand; Abdelfattah, Mohamed S.
Format:	Preprint
Published:	2024
Subjects:	Distributed, Parallel, and Cluster Computing; Artificial Intelligence; Computer Vision and Pattern Recognition; Machine Learning
Online Access:	https://arxiv.org/abs/2403.12981

Similar Items

Scaling LLM Inference Beyond Amdahl's Limits via Eliminating Non-Scalable Overheads
by: Zhao, Alan, et al.
Published: (2026)

Performance Characterization of Containerized DNN Training and Inference on Edge Accelerators
by: K., Prashanthi S., et al.
Published: (2023)

Online Optimization of DNN Inference Network Utility in Collaborative Edge Computing
by: Li, Rui, et al.
Published: (2024)

Experimental Analysis of Server-Side Caching for Web Performance
by: Umar, Mohammad, et al.
Published: (2026)

Experiences with Model Context Protocol Servers for Science and High Performance Computing
by: Pan, Haochen, et al.
Published: (2025)

Where to Split? A Pareto-Front Analysis of DNN Partitioning for Edge Inference
by: Masud, Adiba, et al.
Published: (2026)

Practical Performance Guarantees for Pipelined DNN Inference
by: Archer, Aaron, et al.
Published: (2023)

Fulcrum: Optimizing Concurrent DNN Training and Inferencing on Edge Accelerators
by: K., Prashanthi S., et al.
Published: (2025)

Modular Architecture for High-Performance and Low Overhead Data Transfers
by: Swargo, Rasman Mubtasim, et al.
Published: (2025)

PipeMax: Enhancing Offline LLM Inference on Commodity GPU Servers
by: Zhang, Hongbin, et al.
Published: (2026)

SlimEdge: Performance and Device Aware Distributed DNN Deployment on Resource-Constrained Edge Hardware
by: Kumar, Mahadev Sunil, et al.
Published: (2025)

Collaborative Inference in DNN-based Satellite Systems with Dynamic Task Streams
by: Guan, Jinglong, et al.
Published: (2023)

Evaluating Multi-Instance DNN Inferencing on Multiple Accelerators of an Edge Device
by: Tayal, Mumuksh, et al.
Published: (2025)

HarmonyBatch: Batching multi-SLO DNN Inference with Heterogeneous Serverless Functions
by: Chen, Jiabin, et al.
Published: (2024)

Adaptive Heuristics for Scheduling DNN Inferencing on Edge and Cloud for Personalized UAV Fleets
by: Raj, Suman, et al.
Published: (2024)

AdaOper: Energy-efficient and Responsive Concurrent DNN Inference on Mobile Devices
by: Lin, Zheng, et al.
Published: (2024)

DARIS: An Oversubscribed Spatio-Temporal Scheduler for Real-Time DNN Inference on GPUs
by: Babaei, Amir Fakhim, et al.
Published: (2025)

Why Should the Server Do It All?: A Scalable, Versatile, and Model-Agnostic Framework for Server-Light DNN Inference over Massively Distributed Clients via Training-Free Intermediate Feature Compression
by: Sung, Mingyu, et al.
Published: (2025)

Training DNN Models over Heterogeneous Clusters with Optimal Performance
by: Nie, Chengyi, et al.
Published: (2024)

From Servers to Sites: Compositional Power Trace Generation of LLM Inference for Infrastructure Planning
by: Wilkins, Grant, et al.
Published: (2026)

Collaborative Satellite Computing through Adaptive DNN Task Splitting and Offloading
by: Peng, Shifeng, et al.
Published: (2024)

Infer-EDGE: Dynamic DNN Inference Optimization in 'Just-in-time' Edge-AI Implementations
by: Mounesan, Motahare, et al.
Published: (2025)

Analysis of Server Throughput For Managed Big Data Analytics Frameworks
by: Anagnostakis, Emmanouil, et al.
Published: (2025)

ParvaGPU: Efficient Spatial GPU Sharing for Large-Scale DNN Inference in Cloud Environments
by: Lee, Munkyu, et al.
Published: (2024)

Adaptive Device-Edge Collaboration on DNN Inference in AIoT: A Digital Twin-Assisted Approach
by: Hu, Shisheng, et al.
Published: (2024)

Ecomap: Sustainability-Driven Optimization of Multi-Tenant DNN Execution on Edge Servers
by: Paramanayakam, Varatheepan, et al.
Published: (2025)

A Converting Autoencoder Toward Low-latency and Energy-efficient DNN Inference at the Edge
by: Mahmud, Hasanul, et al.
Published: (2024)

Are Bus-Mounted Edge Servers Feasible?
by: Li, Xuezhi, et al.
Published: (2025)

Preemption Aware Task Scheduling for Priority and Deadline Constrained DNN Inference Task Offloading in Homogeneous Mobile-Edge Networks
by: Cotter, Jamie, et al.
Published: (2025)

A Survey of End-to-End Modeling for Distributed DNN Training: Workloads, Simulators, and TCO
by: Svedas, Jonas, et al.
Published: (2025)

A Survey on Collaborative DNN Inference for Edge Intelligence
by: Ren, Weiqing, et al.
Published: (2022)

Practical Federated Learning without a Server
by: Dhasade, Akash, et al.
Published: (2025)

AdaBridge: Dynamic Data and Computation Reuse for Efficient Multi-task DNN Co-evolution in Edge Systems
by: Wang, Lehao, et al.
Published: (2024)

Enabling Large Batch Size Training for DNN Models Beyond the Memory Limit While Maintaining Performance
by: Piao, XinYu, et al.
Published: (2021)

Modern Computing: Vision and Challenges
by: Gill, Sukhpal Singh, et al.
Published: (2024)

Checkmate: Zero-Overhead Model Checkpointing via Network Gradient Replication
by: Bhardwaj, Ankit, et al.
Published: (2025)

AMSP: Reducing Communication Overhead of ZeRO for Efficient LLM Training
by: Chen, Qiaoling, et al.
Published: (2023)

KaMPIng: Flexible and (Near) Zero-Overhead C++ Bindings for MPI
by: Uhl, Tim Niklas, et al.
Published: (2024)

SCARIF: Towards Carbon Modeling of Cloud Servers with Accelerators
by: Ji, Shixin, et al.
Published: (2024)

Opara: Exploiting Operator Parallelism for Expediting DNN Inference on GPUs
by: Chen, Aodong, et al.
Published: (2023)