Library Catalog

Bibliographic Details
Main Authors:	Coleman, Tainã; Ahmed, Hena; Shende, Ravi; Perez, Ismael; Altintaş, Ïlkay
Format:	Preprint
Published:	2025
Subjects:	Distributed, Parallel, and Cluster Computing; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2506.13730

Similar Items

Ksurf-Drone: Attention Kalman Filter for Contextual Bandit Optimization in Cloud Resource Allocation
by: Dang'ana, Michael, et al.
Published: (2025)

Hardware-Aware Reformulation of Convolutions for Efficient Execution on Specialized AI Hardware: A Case Study on NVIDIA Tensor Cores
by: Bikshandi, Ganesh
Published: (2026)

Online GPU Energy Optimization with Switching-Aware Bandits
by: Xu, Xiongxiao, et al.
Published: (2024)

The Case for Co-Designing Model Architectures with Hardware
by: Anthony, Quentin, et al.
Published: (2024)

SwizzlePerf: Hardware-Aware LLMs for GPU Kernel Performance Optimization
by: Tschand, Arya, et al.
Published: (2025)

Fire-Flyer AI-HPC: A Cost-Effective Software-Hardware Co-Design for Deep Learning
by: An, Wei, et al.
Published: (2024)

LLMServingSim2.0: A Unified Simulator for Heterogeneous Hardware and Serving Techniques in LLM Infrastructure
by: Cho, Jaehong, et al.
Published: (2025)

Hardware Utilization and Inference Performance of Edge Object Detection Under Fault Injection
by: Pasandideh, Faezeh, et al.
Published: (2026)

Viability and Performance of a Private LLM Server for SMBs: A Benchmark Analysis of Qwen3-30B on Consumer-Grade Hardware
by: Khalil, Alex, et al.
Published: (2025)

GWLZ: A Group-wise Learning-based Lossy Compression Framework for Scientific Data
by: Jia, Wenqi, et al.
Published: (2024)

MoE-Lens: Towards the Hardware Limit of High-Throughput MoE LLM Serving Under Resource Constraints
by: Yuan, Yichao, et al.
Published: (2025)

EdgeProfiler: A Fast Profiling Framework for Lightweight LLMs on Edge Using Analytical Model
by: Pinnock, Alyssa, et al.
Published: (2025)

Cascading Bandits With Feedback
by: Prakash, R Sri, et al.
Published: (2025)

Deploying Atmospheric and Oceanic AI Models on Chinese Hardware and Framework: Migration Strategies, Performance Optimization and Analysis
by: Sun, Yuze, et al.
Published: (2025)

Resilient Byzantine Agreement with Predictions
by: Dallot, Julien, et al.
Published: (2026)

On the Impact of White-box Deployment Strategies for Edge AI on Latency and Model Performance
by: Singh, Jaskirat, et al.
Published: (2024)

PRAGMA: A Profiling-Reasoned Multi-Agent Framework for Automatic Kernel Optimization
by: Lei, Kelun, et al.
Published: (2025)

ParaGAN: A Scalable Distributed Training Framework for Generative Adversarial Networks
by: Shi, Ziji, et al.
Published: (2024)

Cloudless-Training: A Framework to Improve Efficiency of Geo-Distributed ML Training
by: Tan, Wenting, et al.
Published: (2023)

FedDCT: A Dynamic Cross-Tier Federated Learning Framework in Wireless Networks
by: Xian, Youquan, et al.
Published: (2023)

AI Inference as Relocatable Electricity Demand: A Latency-Constrained Energy-Geography Framework
by: Luo, Xubin, et al.
Published: (2026)

Placement Semantics for Distributed Deep Learning: A Systematic Framework for Analyzing Parallelism Strategies
by: Mehta, Deep Pankajbhai
Published: (2026)

A Scheduling Framework for Efficient MoE Inference on Edge GPU-NDP Systems
by: Wu, Qi, et al.
Published: (2026)

KVComp: A High-Performance, LLM-Aware, Lossy Compression Framework for KV Cache
by: Jiang, Bo, et al.
Published: (2025)

Binary Bleed: Fast Distributed and Parallel Method for Automatic Model Selection
by: Barron, Ryan, et al.
Published: (2024)

A Parallel CPU-GPU Framework for Batching Heuristic Operations in Depth-First Heuristic Search
by: Futuhi, Ehsan, et al.
Published: (2025)

GPU-Virt-Bench: A Comprehensive Benchmarking Framework for Software-Based GPU Virtualization Systems
by: VG, Jithin, et al.
Published: (2025)

AEG: A Baremetal Framework for AI Acceleration via Direct Hardware Access in Heterogeneous Accelerators
by: Jiang, Hua, et al.
Published: (2026)

Artificial Intelligence for Cost-Aware Resource Prediction in Big Data Pipelines
by: Goyal, Harshit
Published: (2025)

Block: Balancing Load in LLM Serving with Context, Knowledge and Predictive Scheduling
by: Da, Wei, et al.
Published: (2025)

KAITIAN: A Unified Communication Framework for Enabling Efficient Collaboration Across Heterogeneous Accelerators in Embodied AI Systems
by: Lin, Jieke, et al.
Published: (2025)

A Nonlinear Hash-based Optimization Method for SpMV on GPUs
by: Yan, Chen, et al.
Published: (2025)

A Blockchain and Artificial Intelligence based System for Halal Food Traceability
by: Alourani, Abdulla, et al.
Published: (2024)

Towards Verifiable Federated Unlearning: Framework, Challenges, and The Road Ahead
by: Nguyen, Thanh Linh, et al.
Published: (2025)

A Multi-Armed Bandit-Based Participant Selection Method for Federated Recommendation Systems
by: Liu, Jintao, et al.
Published: (2025)

Accurate GPU Memory Prediction for Deep Learning Jobs through Dynamic Analysis
by: Shi, Jiabo, et al.
Published: (2025)

Using Sequential Runtime Distributions for the Parallel Speedup Prediction of SAT Local Search
by: Arbelaez, Alejandro, et al.
Published: (2024)

A Survey on Large Language Model Acceleration based on KV Cache Management
by: Li, Haoyang, et al.
Published: (2024)

Cooperative Cognitive Dynamic System in UAV Swarms: Reconfigurable Mechanism and Framework
by: Jia, Ziye, et al.
Published: (2024)

Towards Carbon-Aware Container Orchestration: Predicting Workload Energy Consumption with Federated Learning
by: Saad, Zainab, et al.
Published: (2025)