| Main Authors: | Gencer, Emir, Issa, Mohammad Kefah Taha, Turimbetov, Ilyas, Trotter, James D., Unat, Didem |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2602.19084 |
Similar Items
The Landscape of GPU-Centric Communication
by: Unat, Didem, et al.
Published: (2024)
Balanced and Elastic End-to-end Training of Dynamic LLMs
by: Wahib, Mohamed, et al.
Published: (2025)
Leveraging HPC Profiling & Tracing Tools to Understand the Performance of Particle-in-Cell Monte Carlo Simulations
by: Williams, Jeremy J., et al.
Published: (2023)
Exploring the Emerging Technologies within the Blockchain Landscape
by: Tareq, Mohammad Ali, et al.
Published: (2024)
LLMTailor: A Layer-wise Tailoring Tool for Efficient Checkpointing of Large Language Models
by: Sun, Minqiu, et al.
Published: (2026)
Orchestrated Co-scheduling, Resource Partitioning, and Power Capping on CPU-GPU Heterogeneous Systems via Machine Learning
by: Saba, Issa, et al.
Published: (2024)
Communication-Efficient Model Aggregation with Layer Divergence Feedback in Federated Learning
by: Wang, Liwei, et al.
Published: (2024)
Multi-level Memory-Centric Profiling on ARM Processors with ARM SPE
by: Miksits, Samuel, et al.
Published: (2024)
Multi-Layer Scheduling for MoE-Based LLM Reasoning
by: Sun, Yifan, et al.
Published: (2026)
A Scalable Multi-Layered Blockchain Architecture for Enhanced EHR Sharing and Drug Supply Chain Management
by: Javan, Reza, et al.
Published: (2024)
Optimizing Hardware Resource Partitioning and Job Allocations on Modern GPUs under Power Caps
by: Arima, Eishi, et al.
Published: (2024)
Visualizing Distributed Traces in Aggregate
by: Samanta, Adrita, et al.
Published: (2024)
Eliminating Hidden Serialization in Multi-Node Megakernel Communication
by: Oh, Byungsoo, et al.
Published: (2026)
Memory-Centric Computing: Solving Computing's Memory Problem
by: Mutlu, Onur, et al.
Published: (2025)
Automatic Tracing in Task-Based Runtime Systems
by: Yadav, Rohan, et al.
Published: (2024)
Tracing Distributed Algorithms Using Replay Clocks
by: Lagwankar, Ishaan
Published: (2024)
A Multi-Armed Bandit-Based Participant Selection Method for Federated Recommendation Systems
by: Liu, Jintao, et al.
Published: (2025)
Communication Lower Bounds and Algorithms for Sketching with Random Dense Matrices
by: Daas, Hussam Al, et al.
Published: (2026)
Energy Efficient Federated Learning with Hyperdimensional Computing over Wireless Communication Networks
by: Ding, Yahao, et al.
Published: (2026)
Power-Aware Scheduling for Multi-Center HPC Electricity Cost Optimization
by: Hossain, Abrar, et al.
Published: (2025)
Datacenter Energy Optimized Power Profiles
by: Narayanaswamy, Sreedhar, et al.
Published: (2025)
Cloud Revolution: Tracing the Origins and Rise of Cloud Computing
by: Gurung, Deepa, et al.
Published: (2025)
Ares II: Tracing the Flaws of a (Storage) God
by: Georgiou, Chryssis, et al.
Published: (2024)
From Attention to Disaggregation: Tracing the Evolution of LLM Inference
by: Kumar, Madabattula Rajesh, et al.
Published: (2025)
Distributed Generative Inference of LLM at Internet Scales with Multi-Dimensional Communication Optimization
by: Chen, Jiu, et al.
Published: (2026)
FedAPTA: Federated Multi-task Learning for Heterogeneous Devices with Adaptive Layer-wise Pruning and Task-aware Aggregation
by: Yu, Zhen, et al.
Published: (2025)
A Knowledge Distillation-empowered Adaptive Federated Reinforcement Learning Framework for Multi-Domain IoT Applications Scheduling
by: Wang, Zhiyu, et al.
Published: (2025)
Multi-Factor Trust-Driven Secure Communication Model for Cloud-Based Digital Twins
by: Saxena, Deepika, et al.
Published: (2026)
Ray Tracing Cores for General-Purpose Computing: A Literature Review
by: Meneses, Enzo, et al.
Published: (2026)
Mycroft: Tracing Dependencies in Collective Communication Towards Reliable LLM Training
by: Deng, Yangtao, et al.
Published: (2025)
Efficient Serverless Cold Start: Reducing Library Loading Overhead by Profile-guided Optimization
by: Tariq, Syed Salauddin Mohammad, et al.
Published: (2025)
Accelerating Intra-Node GPU-to-GPU Communication Through Multi-Path Transfers with CUDA Graphs
by: Sojoodi, Amirhossein, et al.
Published: (2026)
Trace Replay Simulation of MIT SuperCloud for Studying Optimal Sustainability Policies
by: Brewer, Wesley, et al.
Published: (2025)
HYDRA: Breaking the Global Ordering Barrier in Multi-BFT Consensus
by: Lyu, Hanzheng, et al.
Published: (2025)
Efficient Parallel Compilation and Profiling of Quantum Circuits at Large Scales
by: Moore, Jane, et al.
Published: (2026)
Self-adaptive, Requirements-driven Autoscaling of Microservices
by: Nunes, João Paulo Karol Santos, et al.
Published: (2024)
Syncopate: Efficient Multi-GPU AI Kernels via Automatic Chunk-Centric Compute-Communication Overlap
by: Qiang, Xinwei, et al.
Published: (2026)
Memory-Centric Computing: Recent Advances in Processing-in-DRAM
by: Mutlu, Onur, et al.
Published: (2024)
From Servers to Sites: Compositional Power Trace Generation of LLM Inference for Infrastructure Planning
by: Wilkins, Grant, et al.
Published: (2026)
Trace-based, time-resolved analysis of MPI application performance using standard metrics
by: Haldar, Kingshuk
Published: (2025)