| Main author: | Ke, Chih-Hua |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online access: | https://arxiv.org/abs/2605.08725 |
Similar items
SAHM: State-Aware Heterogeneous Multicore for Single-Thread Performance
by: Wadle, Shayne, et al.
Published: (2025)
Simulation-Driven Evaluation of Chiplet-Based Architectures Using VisualSim
by: Ali, Wajid, et al.
Published: (2025)
How to Increase Energy Efficiency with a Single Linux Command
by: Jelvani, Alborz, et al.
Published: (2025)
PipeWeave: Synergizing Analytical and Learning Models for Unified GPU Performance Prediction
by: Zhang, Kaixuan, et al.
Published: (2026)
Towards CPU Performance Prediction: New Challenge Benchmark Dataset and Novel Approach
by: Liu, Xiaoman
Published: (2024)
Data-Driven Power Modeling and Monitoring via Hardware Performance Counter Tracking
by: Mazzola, Sergio, et al.
Published: (2025)
USEFUSE: Uniform Stride for Enhanced Performance in Fused Layer Architecture of Deep Neural Networks
by: Ibrahim, Muhammad Sohail, et al.
Published: (2024)
ACALSim: A Scalable Parallel Simulation Framework for High-Performance System Design Space Exploration
by: Lin, Wei-Fen, et al.
Published: (2026)
Gem5-AcceSys: Enabling System-Level Exploration of Standard Interconnects for Novel Accelerators
by: Liu, Qunyou, et al.
Published: (2025)
Strassen Multisystolic Array Hardware Architectures
by: Pogue, Trevor E., et al.
Published: (2025)
JSPIM: A Skew-Aware PIM Accelerator for High-Performance Databases Join and Select Operations
by: Tajdari, Sabiha, et al.
Published: (2025)
GDEV-AI: A Generalized Evaluation of Deep Learning Inference Scaling and Architectural Saturation
by: Palaniappan, Kathiravan
Published: (2026)
Enhancing Instruction Prefetching via Cache and TLB Management
by: Jamet, Alexandre Valentin, et al.
Published: (2026)
ETM2: Empowering Traditional Memory Bandwidth Regulation using ETM
by: Zuepke, Alexander, et al.
Published: (2026)
SPEC CPU2026: Characterization, Representativeness, and Cross-Suite Comparison
by: Li, Ruihao, et al.
Published: (2026)
Regular-Dead on Arrival: Characterizing and Protecting Against Dead-Entry TLB Misses in GPU Microarchitectures
by: Anik, Shafayat Mowla, et al.
Published: (2026)
Range, Not Precision: Block-Floating-Point Half-Precision FFT and SAR Imaging on Apple Silicon
by: Bergach, Mohamed Amine
Published: (2026)
SCALE-Sim TPU: Validating and Extending SCALE-Sim for TPUs
by: Dang, Jingtian, et al.
Published: (2026)
WaveTune: Wave-aware Bilinear Modeling for Efficient GPU Kernel Auto-tuning
by: Zhang, Kaixuan, et al.
Published: (2026)
Heterogeneous Memory Benchmarking Toolkit
by: Ghaemi, Golsana, et al.
Published: (2025)
Adaptive Cache Pollution Control for Large Language Model Inference Workloads Using Temporal CNN-Based Prediction and Priority-Aware Replacement
by: Liu, Songze, et al.
Published: (2025)
Recurrent CircuitSAT Sampling for Sequential Circuits
by: Ardakani, Arash, et al.
Published: (2025)
Introducing the Arm-membench Throughput Benchmark
by: Burth, Cyrill, et al.
Published: (2025)
Enhancing software-hardware co-design for HEP by low-overhead profiling of single- and multi-threaded programs on diverse architectures with Adaptyst
by: Graczyk, Maksymilian, et al.
Published: (2025)
Makinote: An FPGA-Based HW/SW Platform for Pre-Silicon Emulation of RISC-V Designs
by: Perdomo, Elias, et al.
Published: (2024)
AI Load Dynamics--A Power Electronics Perspective
by: Li, Yuzhuo, et al.
Published: (2025)
ONNXim: A Fast, Cycle-level Multi-core NPU Simulator
by: Ham, Hyungkyu, et al.
Published: (2024)
LightningSimV2: Faster and Scalable Simulation for High-Level Synthesis via Graph Compilation and Optimization
by: Sarkar, Rishov, et al.
Published: (2024)
CXL-Interference: Analysis and Characterization in Modern Computer Systems
by: Mao, Shunyu, et al.
Published: (2024)
OPTIMA: Design-Space Exploration of Discharge-Based In-SRAM Computing: Quantifying Energy-Accuracy Trade-Offs
by: Seyedfaraji, Saeed, et al.
Published: (2024)
An Analytical Cost Model for Fast Evaluation of Multiple Compute-Engine CNN Accelerators
by: Qararyah, Fareed, et al.
Published: (2025)
Characterizing and Optimizing Realistic Workloads on a Commercial Compute-in-SRAM Device
by: Zhang, Niansong, et al.
Published: (2025)
Accelerating Transistor-Level Simulation of Integrated Circuits via Equivalence of RC Long-Chain Structures
by: Tang, Ruibai, et al.
Published: (2025)
The Bicameral Cache: a split cache for vector architectures
by: Rebolledo, Susana, et al.
Published: (2024)
A Quantitative Analysis and Guidelines of Data Streaming Accelerator in Modern Intel Xeon Scalable Processors
by: Kuper, Reese, et al.
Published: (2023)
A$^3$PIM: An Automated, Analytic and Accurate Processing-in-Memory Offloader
by: Jiang, Qingcai, et al.
Published: (2024)
SCALE-Sim v3: A modular cycle-accurate systolic accelerator simulator for end-to-end system analysis
by: Raj, Ritik, et al.
Published: (2025)
OmniSim: Simulating Hardware with C Speed and RTL Accuracy for High-Level Synthesis Designs
by: Sarkar, Rishov, et al.
Published: (2025)
GeneTEK: Low-power, high-performance and scalable FPGA architecture for exact unit-cost edit distance
by: Espinosa, Elena, et al.
Published: (2025)
Análisis de rendimiento y eficiencia energética en el cluster Raspberry Pi Cronos
by: Semken, Martha, et al.
Published: (2025)