| Main Authors: | Zhang, Zhiwei; Shen, Jiayu; Kumar, Niraj; Pistoia, Marco |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2504.00987 |
Similar Items
GPU-Parallelizable Randomized Sketch-and-Precondition for Linear Regression using Sparse Sign Sketches
by: Chen, Tyler, et al.
Published: (2025)
Mangrove: Fast and Parallelizable State Replication for Blockchains
by: Paramonov, Anton, et al.
Published: (2025)
Privacy-preserving quantum federated learning via gradient hiding
by: Li, Changhao, et al.
Published: (2023)
Efficiently Parallelizable Strassen-Based Multiplication of a Matrix by its Transpose
by: Arrigoni, Viviana, et al.
Published: (2021)
Energy-Efficient Real-Time Job Mapping and Resource Management in Mobile-Edge Computing
by: Gao, Chuanchao, et al.
Published: (2025)
Optimal Fixed Priority Scheduling in Multi-Stage Multi-Resource Distributed Real-Time Systems
by: Kumar, Niraj, et al.
Published: (2024)
Asymptotically Optimal Scheduling of Multiple Parallelizable Job Classes
by: Berg, Benjamin, et al.
Published: (2024)
Design of A Low-Latency and Parallelizable SVD Dataflow Architecture on FPGA
by: Du, Fangqiang, et al.
Published: (2025)
Parallel/Distributed Tabu Search for Scheduling Microprocessor Tasks in Hybrid Flowshop
by: Janiak, Adam, et al.
Published: (2025)
More is Different: Prototyping and Analyzing a New Form of Edge Server with Massive Mobile SoCs
by: Zhang, Li, et al.
Published: (2022)
KubePACS: Kubernetes Cluster Using Performant, Highly Available, and Cost Efficient Spot Instances
by: Kim, Taeyoon, et al.
Published: (2026)
Chopin: An Open Source R-language Tool to Support Spatial Analysis on Parallelizable Infrastructure
by: Song, Insang, et al.
Published: (2024)
MaaSO: SLO-aware Orchestration of Heterogeneous Model Instances for MaaS
by: Xuan, Mo, et al.
Published: (2025)
Matrix representation and GPU-optimized parallel B-spline computing
by: Wu, Jiayu, et al.
Published: (2025)
Parcae: Proactive, Liveput-Optimized DNN Training on Preemptible Instances
by: Duan, Jiangfei, et al.
Published: (2024)
Will LLMs Scaling Hit the Wall? Breaking Barriers via Distributed Resources on Massive Edge Devices
by: Shen, Tao, et al.
Published: (2025)
APWA: A Distributed Architecture for Parallelizable Agentic Workflows
by: Rose, Evan, et al.
Published: (2026)
Optimal Workload Placement on Multi-Instance GPUs
by: Turkkan, Bekir, et al.
Published: (2024)
Ding-Dong Ditch: Peeking Into Spot Instance Availability
by: Kim, Kyumin, et al.
Published: (2026)
Improving GPU Multi-Tenancy Through Dynamic Multi-Instance GPU Reconfiguration
by: Wang, Tianyu, et al.
Published: (2024)
A Survey of Distributed Graph Algorithms on Massive Graphs
by: Meng, Lingkai, et al.
Published: (2024)
Managing Multi Instance GPUs for High Throughput and Energy Savings
by: Saraha, Abhijeet, et al.
Published: (2025)
Incisor: Ex Ante Cloud Instance Selection for HPC Jobs
by: Laurenzano, Michael A., et al.
Published: (2026)
EcoServe: Enabling Cost-effective LLM Serving with Proactive Intra- and Inter-Instance Orchestration
by: Du, Jiangsu, et al.
Published: (2025)
Evaluating Multi-Instance DNN Inferencing on Multiple Accelerators of an Edge Device
by: Tayal, Mumuksh, et al.
Published: (2025)
Massively parallel CMA-ES with increasing population
by: Redon, David, et al.
Published: (2024)
Fantasy: Efficient Large-scale Vector Search on GPU Clusters with GPUDirect Async
by: Liu, Yi, et al.
Published: (2025)
Tenplex: Dynamic Parallelism for Deep Learning using Parallelizable Tensor Collections
by: Wagenländer, Marcel, et al.
Published: (2023)
GFS: A Preemption-aware Scheduling Framework for GPU Clusters with Predictive Spot Instance Management
by: Duan, Jiaang, et al.
Published: (2025)
Minos: Exploiting Cloud Performance Variation with Function-as-a-Service Instance Selection
by: Schirmer, Trever, et al.
Published: (2025)
Efficient Routing of Inference Requests across LLM Instances in Cloud-Edge Computing
by: Yu, Shibo, et al.
Published: (2025)
HybridFlow: Resource-Adaptive Subtask Routing for Efficient Edge-Cloud LLM Inference
by: Dong, Jiangwen, et al.
Published: (2025)
A New Perspective of Graph Data and A Generic and Efficient Method for Large Scale Graph Data Traversal
by: Zhang, Chenglong
Published: (2020)
An Online Fragmentation-Aware Scheduler for Managing GPU-Sharing Workloads on Multi-Instance GPUs
by: Ting, Hsu-Tzu, et al.
Published: (2025)
Efficient Parallel Implementation of the Pilot Assignment Problem in Massive MIMO Systems
by: Alqudah, Eman, et al.
Published: (2025)
Split Fine-Tuning for Large Language Models in Wireless Networks
by: Zhang, Songge, et al.
Published: (2025)
On the Partitioning of GPU Power among Multi-Instances
by: Vamja, Tirth, et al.
Published: (2025)
SpotVista: Availability-Aware Recommendation System for Reliable and Cost-Efficient Multi-Node Spot Instances
by: Kim, Taeyoon, et al.
Published: (2026)
PilotANN: Memory-Bounded GPU Acceleration for Vector Search
by: Gui, Yuntao, et al.
Published: (2025)
SwarmSearch: Decentralized Search Engine with Self-Funding Economy
by: Gregoriadis, Marcel, et al.
Published: (2025)