| Main Authors: | Prabhu, Ritvik, Vatai, Emil, Moussad, Bernard, Jeannot, Emmanuel, Anandakrishnan, Ramu, Feng, Wu-chun, Wahib, Mohamed |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2603.16721 |
Similar Items
SAFE: Improving LLM Systems using Sentence-Level In-generation Attribution
by: Batista, João Eduardo, et al.
Published: (2025)
Balanced and Elastic End-to-end Training of Dynamic LLMs
by: Wahib, Mohamed, et al.
Published: (2025)
Can Tensor Cores Benefit Memory-Bound Kernels? (No!)
by: Zhang, Lingqi, et al.
Published: (2025)
Identifying Multi-Hit Cancer Drivers Without Massive Parallelization: A CP, MIP, and Column Generation Framework
by: Willemsen, Rick S. H., et al.
Published: (2026)
A Unifying Framework to Enable Artificial Intelligence in High Performance Computing Workflows
by: Domke, Jens, et al.
Published: (2025)
Distributed Genetic Algorithm for Feature Selection
by: Potter, Michael, et al.
Published: (2024)
SHIRO: Near-Optimal Communication Strategies for Distributed Sparse Matrix Multiplication
by: Zhuang, Chen, et al.
Published: (2025)
RAPTOR: Practical Numerical Profiling of Scientific Applications
by: Hoerold, Faveo, et al.
Published: (2025)
Looking for the Information Needle in the Internet Haystack.
by: Clausen, Helge
Published: (1996)
Tadashi: Enabling AI-Based Automated Code Generation With Guaranteed Correctness
by: Vatai, Emil, et al.
Published: (2024)
CG-Kit: Code Generation Toolkit for Performant and Maintainable Variants of Source Code Applied to Flash-X Hydrodynamics Simulations
by: Rudi, Johann, et al.
Published: (2024)
Practical GPU Choices for Earth Observation: ResNet-50 Training Throughput on Integrated, Laptop, and Cloud Accelerators
by: Chaturvedi, Ritvik
Published: (2025)
Scaling Large-scale GNN Training to Thousands of Processors on CPU-based Supercomputers
by: Zhuang, Chen, et al.
Published: (2024)
Efficient and Portable Support for Overdecomposition on Distributed Memory GPGPU Platforms
by: Bhosale, Aditya, et al.
Published: (2026)
Enabling Dynamic Sparsity in Quantized LLM Inference
by: Wang, Rongxiang, et al.
Published: (2025)
Performance bounds for priority-based stochastic coflow scheduling
by: Brun, Olivier, et al.
Published: (2025)
Genomic data processing with GenomeFlow
by: Park, Junseok, et al.
Published: (2025)
Parallel Online Directed Acyclic Graph Exploration for Atlasing Soft-Matter Assembly Configuration Spaces
by: Prabhu, Rahul, et al.
Published: (2024)
Sparsity-Aware Roofline Models for Sparse Matrix-Matrix Multiplication
by: Qian, Matthew, et al.
Published: (2026)
Parallel Order-Based Core Maintenance in Dynamic Graphs
by: Guo, Bin, et al.
Published: (2022)
Sparsity-Preserving Encodings for Straggler-Optimal Distributed Matrix Computations at the Edge
by: Das, Anindya Bijoy, et al.
Published: (2024)
Gradient Compression and Correlation Driven Federated Learning for Wireless Traffic Prediction
by: Zhang, Chuanting, et al.
Published: (2025)
Computing: Looking Back and Moving Forward
by: Golec, Muhammed, et al.
Published: (2024)
Paradigm Shift in Infrastructure Inspection Technology: Leveraging High-performance Imaging and Advanced AI Analytics to Inspect Road Infrastructure
by: Wu, Du, et al.
Published: (2025)
Empowering Distributed Training with Sparsity-driven Data Synchronization
by: Wang, Zhuang, et al.
Published: (2023)
Federated k-Core Decomposition: A Secure Distributed Approach
by: Guo, Bin, et al.
Published: (2024)
S-HPLB: Efficient LLM Attention Serving via Sparsity-Aware Head Parallelism Load Balance
by: Liu, Di, et al.
Published: (2026)
A Hierarchical Security Events Correlation Model for Real-time Cyber Threat Detection and Response
by: Maosa, Herbert, et al.
Published: (2023)
MSAO: Adaptive Modality Sparsity-Aware Offloading with Edge-Cloud Collaboration for Efficient Multimodal LLM Inference
by: Yang, Zheming, et al.
Published: (2026)
AMReX and pyAMReX: Looking Beyond ECP
by: Myers, Andrew, et al.
Published: (2024)
Adaptive Parallel Downloader for Large Genomic Datasets
by: Swargo, Rasman Mubtasim, et al.
Published: (2025)
Practical Performance Guarantees for Pipelined DNN Inference
by: Archer, Aaron, et al.
Published: (2023)
New Concurrent Order Maintenance Data Structure
by: Guo, Bin, et al.
Published: (2022)
SaberLDA: Sparsity-Aware Learning of Topic Models on GPUs
by: Li, Kaiwei, et al.
Published: (2016)
Predicting Temporal Aspects of Movement for Predictive Replication in Fog Environments
by: Balitzki, Emil, et al.
Published: (2023)
Exploiting Unstructured Sparsity in Fully Homomorphic Encrypted DNNs
by: Ferguson, Aidan, et al.
Published: (2025)
DualSparse-MoE: Coordinating Tensor/Neuron-Level Sparsity with Expert Partition and Reconstruction
by: Cai, Weilin, et al.
Published: (2025)
Comparison of Vectorization Capabilities of Different Compilers for X86 and ARM CPUs
by: Sakib, Nazmus, et al.
Published: (2025)
High-Performance Portable GPU Primitives for Arbitrary Types and Operators in Julia
by: Pilliat, Emmanuel
Published: (2026)
SwarmSearch: Decentralized Search Engine with Self-Funding Economy
by: Gregoriadis, Marcel, et al.
Published: (2025)