:: Library Catalog

Bibliographic Details
Main Authors:	Prabhu, Ritvik; Vatai, Emil; Moussad, Bernard; Jeannot, Emmanuel; Anandakrishnan, Ramu; Feng, Wu-chun; Wahib, Mohamed
Format:	Preprint
Published:	2026
Subjects:	Distributed, Parallel, and Cluster Computing
Online Access:	https://arxiv.org/abs/2603.16721

Similar Items

SAFE: Improving LLM Systems using Sentence-Level In-generation Attribution
by: Batista, João Eduardo, et al.
Published: (2025)

Balanced and Elastic End-to-end Training of Dynamic LLMs
by: Wahib, Mohamed, et al.
Published: (2025)

Can Tensor Cores Benefit Memory-Bound Kernels? (No!)
by: Zhang, Lingqi, et al.
Published: (2025)

Identifying Multi-Hit Cancer Drivers Without Massive Parallelization: A CP, MIP, and Column Generation Framework
by: Willemsen, Rick S. H., et al.
Published: (2026)

A Unifying Framework to Enable Artificial Intelligence in High Performance Computing Workflows
by: Domke, Jens, et al.
Published: (2025)

Distributed Genetic Algorithm for Feature Selection
by: Potter, Michael, et al.
Published: (2024)

SHIRO: Near-Optimal Communication Strategies for Distributed Sparse Matrix Multiplication
by: Zhuang, Chen, et al.
Published: (2025)

RAPTOR: Practical Numerical Profiling of Scientific Applications
by: Hoerold, Faveo, et al.
Published: (2025)

Looking for the Information Needle in the Internet Haystack.
by: Clausen, Helge
Published: (1996)

Tadashi: Enabling AI-Based Automated Code Generation With Guaranteed Correctness
by: Vatai, Emil, et al.
Published: (2024)

CG-Kit: Code Generation Toolkit for Performant and Maintainable Variants of Source Code Applied to Flash-X Hydrodynamics Simulations
by: Rudi, Johann, et al.
Published: (2024)

Practical GPU Choices for Earth Observation: ResNet-50 Training Throughput on Integrated, Laptop, and Cloud Accelerators
by: Chaturvedi, Ritvik
Published: (2025)

Scaling Large-scale GNN Training to Thousands of Processors on CPU-based Supercomputers
by: Zhuang, Chen, et al.
Published: (2024)

Efficient and Portable Support for Overdecomposition on Distributed Memory GPGPU Platforms
by: Bhosale, Aditya, et al.
Published: (2026)

Enabling Dynamic Sparsity in Quantized LLM Inference
by: Wang, Rongxiang, et al.
Published: (2025)

Performance bounds for priority-based stochastic coflow scheduling
by: Brun, Olivier, et al.
Published: (2025)

Genomic data processing with GenomeFlow
by: Park, Junseok, et al.
Published: (2025)

Parallel Online Directed Acyclic Graph Exploration for Atlasing Soft-Matter Assembly Configuration Spaces
by: Prabhu, Rahul, et al.
Published: (2024)

Sparsity-Aware Roofline Models for Sparse Matrix-Matrix Multiplication
by: Qian, Matthew, et al.
Published: (2026)

Parallel Order-Based Core Maintenance in Dynamic Graphs
by: Guo, Bin, et al.
Published: (2022)

Sparsity-Preserving Encodings for Straggler-Optimal Distributed Matrix Computations at the Edge
by: Das, Anindya Bijoy, et al.
Published: (2024)

Gradient Compression and Correlation Driven Federated Learning for Wireless Traffic Prediction
by: Zhang, Chuanting, et al.
Published: (2025)

Computing: Looking Back and Moving Forward
by: Golec, Muhammed, et al.
Published: (2024)

Paradigm Shift in Infrastructure Inspection Technology: Leveraging High-performance Imaging and Advanced AI Analytics to Inspect Road Infrastructure
by: Wu, Du, et al.
Published: (2025)

Empowering Distributed Training with Sparsity-driven Data Synchronization
by: Wang, Zhuang, et al.
Published: (2023)

Federated k-Core Decomposition: A Secure Distributed Approach
by: Guo, Bin, et al.
Published: (2024)

S-HPLB: Efficient LLM Attention Serving via Sparsity-Aware Head Parallelism Load Balance
by: Liu, Di, et al.
Published: (2026)

A Hierarchical Security Events Correlation Model for Real-time Cyber Threat Detection and Response
by: Maosa, Herbert, et al.
Published: (2023)

MSAO: Adaptive Modality Sparsity-Aware Offloading with Edge-Cloud Collaboration for Efficient Multimodal LLM Inference
by: Yang, Zheming, et al.
Published: (2026)

AMReX and pyAMReX: Looking Beyond ECP
by: Myers, Andrew, et al.
Published: (2024)

Adaptive Parallel Downloader for Large Genomic Datasets
by: Swargo, Rasman Mubtasim, et al.
Published: (2025)

Practical Performance Guarantees for Pipelined DNN Inference
by: Archer, Aaron, et al.
Published: (2023)

New Concurrent Order Maintenance Data Structure
by: Guo, Bin, et al.
Published: (2022)

SaberLDA: Sparsity-Aware Learning of Topic Models on GPUs
by: Li, Kaiwei, et al.
Published: (2016)

Predicting Temporal Aspects of Movement for Predictive Replication in Fog Environments
by: Balitzki, Emil, et al.
Published: (2023)

Exploiting Unstructured Sparsity in Fully Homomorphic Encrypted DNNs
by: Ferguson, Aidan, et al.
Published: (2025)

DualSparse-MoE: Coordinating Tensor/Neuron-Level Sparsity with Expert Partition and Reconstruction
by: Cai, Weilin, et al.
Published: (2025)

Comparison of Vectorization Capabilities of Different Compilers for X86 and ARM CPUs
by: Sakib, Nazmus, et al.
Published: (2025)

High-Performance Portable GPU Primitives for Arbitrary Types and Operators in Julia
by: Pilliat, Emmanuel
Published: (2026)

SwarmSearch: Decentralized Search Engine with Self-Funding Economy
by: Gregoriadis, Marcel, et al.
Published: (2025)