| Main Author: | Katagiri, Takahiro |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2408.16607 |
Similar Items
RAO-SS: A Prototype of Run-time Auto-tuning Facility for Sparse Direct Solvers
by: Katagiri, Takahiro, et al.
Published: (2024)
An Auto-tuning Method for Run-time Data Transformation for Sparse Matrix-Vector Multiplication
by: Katagiri, Takahiro, et al.
Published: (2024)
Xabclib: A Fully Auto-tuned Sparse Iterative Solver
by: Katagiri, Takahiro, et al.
Published: (2024)
Towards Generalized Parameter Tuning in Coherent Ising Machines: A Portfolio-Based Approach
by: Hanyu, Tatsuro, et al.
Published: (2025)
A Communication Avoiding and Reducing Algorithm for Symmetric Eigenproblem for Very Small Matrices
by: Katagiri, Takahiro, et al.
Published: (2024)
Performance Evaluation of CMOS Annealing with Support Vector Machine
by: Fukuhara, Ryoga, et al.
Published: (2024)
MLKAPS: Machine Learning and Adaptive Sampling for HPC Kernel Auto-tuning
by: Jam, Mathys, et al.
Published: (2025)
WaveTune: Wave-aware Bilinear Modeling for Efficient GPU Kernel Auto-tuning
by: Zhang, Kaixuan, et al.
Published: (2026)
Learning-Augmented Performance Model for Tensor Product Factorization in High-Order FEM
by: Ren, Xuanzhengbo, et al.
Published: (2026)
Bringing Auto-tuning to HIP: Analysis of Tuning Impact and Difficulty on AMD and Nvidia GPUs
by: Lurati, Milo, et al.
Published: (2024)
iSpLib: A Library for Accelerating Graph Neural Networks using Auto-tuned Sparse Operations
by: Anik, Md Saidul Hoque, et al.
Published: (2024)
Effects of the Auto-Correlation of Delays on the Age of Information: A Gaussian Process Framework
by: Inoie, Atsushi, et al.
Published: (2025)
Performance Characterization of AutoNUMA Memory Tiering on Graph Analytics
by: Moura, Diego, et al.
Published: (2022)
A Microbenchmark Framework for Performance Evaluation of OpenMP Target Offloading
by: Atif, Mohammad, et al.
Published: (2025)
Integration of a systolic array based hardware accelerator into a DNN operator auto-tuning framework
by: Peccia, F. N., et al.
Published: (2022)
Integrating ytopt and libEnsemble to Autotune OpenMC
by: Wu, Xingfu, et al.
Published: (2024)
Deploying Open-Source Large Language Models: A performance Analysis
by: Bendi-Ouis, Yannis, et al.
Published: (2024)
AutoKernel: Autonomous GPU Kernel Optimization via Iterative Agent-Driven Search
by: Jaber, Jaber, et al.
Published: (2026)
AutoSAGE: Input-Aware CUDA Scheduling for Sparse GNN Aggregation (SpMM/SDDMM) and CSR Attention
by: Stankovic, Aleksandar
Published: (2025)
Assessing the Performance of OpenTitan as Cryptographic Accelerator in Secure Open-Hardware System-on-Chips
by: Parisi, Emanuele, et al.
Published: (2024)
DistZO2: High-Throughput and Memory-Efficient Zeroth-Order Fine-tuning LLMs with Distributed Parallel Computing
by: Wang, Liangyu, et al.
Published: (2025)
Towards a Scalable and Efficient PGAS-based Distributed OpenMP
by: Shan, Baodi, et al.
Published: (2024)
AutoLALA: Automatic Loop Algebraic Locality Analysis for AI and HPC Kernels
by: Zhu, Yifan, et al.
Published: (2026)
Profiling Large Language Model Inference on Apple Silicon: A Quantization Perspective
by: Benazir, Afsara, et al.
Published: (2025)
CAPSim: A Fast CPU Performance Simulator Using Attention-based Predictor
by: Xu, Buqing, et al.
Published: (2025)
Tuning the Tuner: Introducing Hyperparameter Optimization for Auto-Tuning
by: Willemsen, Floris-Jan, et al.
Published: (2025)
Optimizing Cloud-native Services with SAGA: A Service Affinity Graph-based Approach
by: Dinh-Tuan, Hai, et al.
Published: (2025)
AdaGradSelect: An adaptive gradient-guided layer selection method for efficient fine-tuning of SLMs
by: Kumar, Anshul, et al.
Published: (2025)
Understanding and Benchmarking Artificial Intelligence: OpenAI's o3 Is Not AGI
by: Pfister, Rolf, et al.
Published: (2025)
A relação entre a «performance» social e a «performance» económico-financeira
by: Daniel Taborda
Published: (2007)
From Concept to Reality: 5G Positioning with Open-Source Implementation of UL-TDoA in OpenAirInterface
by: Malik, Adeel, et al.
Published: (2024)
An Empirical Study on Method-Level Performance Evolution in Open-Source Java Projects
by: Shahedi, Kaveh, et al.
Published: (2025)
AI Application Benchmarking: Power-Aware Performance Analysis for Vision and Language Models
by: Mayr, Martin, et al.
Published: (2026)
PowerSensor3: A Fast and Accurate Open Source Power Measurement Tool
by: van der Vlugt, Steven, et al.
Published: (2025)
DIAL: Decentralized I/O AutoTuning via Learned Client-side Local Metrics for Parallel File System
by: Rashid, Md Hasanur, et al.
Published: (2026)
Automated PMC-based Power Modeling Methodology for Modern Mobile GPUs
by: Dash, Pranab, et al.
Published: (2024)
Impact of Generative AI (Large Language Models) on the PRA model construction and maintenance, observations
by: Rychkov, Valentin, et al.
Published: (2024)
Inspection of I/O Operations from System Call Traces using Directly-Follows-Graph
by: Sankaran, Aravind, et al.
Published: (2024)
Optimizing OpenFaaS on Kubernetes: Comparative Analysis of Language Runtimes and Cluster Distributions
by: Ataie, Ehsan, et al.
Published: (2026)
Tuning Fast Memory Size based on Modeling of Page Migration for Tiered Memory
by: Chen, Shangye, et al.
Published: (2024)