| Main Authors: | Kurzynski, Marco; Sinclair, Matthew D. |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2501.18113 |
Similar Items
parti-gem5: gem5's Timing Mode Parallelised
by: Cubero-Cascante, José, et al.
Published: (2023)
Toward Reproducible and Standardized Computer Architecture Simulation with gem5
by: Pai, Kunal, et al.
Published: (2025)
Understanding Simulated Architecture via gem5 Call-Stack Profiling
by: Söderström, Johan, et al.
Published: (2026)
CHAOS: Controlled Hardware fAult injectOr System for gem5
by: Vinciguerra, Elio, et al.
Published: (2026)
Chopper: A Multi-Level GPU Characterization Tool & Derived Insights Into LLM Training Inefficiency
by: Kurzynski, Marco, et al.
Published: (2025)
Lit Silicon: A Case Where Thermal Imbalance Couples Concurrent Execution in Multiple GPUs
by: Kurzynski, Marco, et al.
Published: (2025)
gem5 Co-Pilot: AI Assistant Agent for Architectural Design Space Exploration
by: Fu, Zuoming, et al.
Published: (2025)
Advancing Cloud Computing Capabilities on gem5 by Implementing the RISC-V Hypervisor Extension
by: Fragkoulis, George-Marios, et al.
Published: (2024)
Anatomy of the gem5 Simulator: AtomicSimpleCPU, TimingSimpleCPU, O3CPU, and Their Interaction with the Ruby Memory System
by: Söderström, Johan, et al.
Published: (2025)
CXL-ClusterSim: Modeling CXL-based Disaggregated Memory Cluster for Pooling and Sharing using gem5 and SST
by: Goswami, Kaustav, et al.
Published: (2026)
Extend IVerilog to Support Batch RTL Fault Simulation
by: Tang, Jiaping, et al.
Published: (2025)
Support Vector Machines Classification on Bendable RISC-V
by: Vergos, Polykarpos, et al.
Published: (2025)
Multiport Support for Vortex OpenGPU Memory Hierarchy
by: Shin, Injae, et al.
Published: (2025)
3D MPSoC with On-Chip Cache Support -- Design and Exploitation
by: Cataldo, Rodrigo, et al.
Published: (2025)
Architecture, Simulation and Software Stack to Support Post-CMOS Accelerators: The ARCHYTAS Project
by: Agosta, Giovanni, et al.
Published: (2025)
VIKIN: A Reconfigurable Accelerator for KANs and MLPs with Two-Stage Sparsity Support
by: Ou, Wenhui, et al.
Published: (2026)
Optimized Memory System Architecture for VESA VDC-M Decoder with Multi-Slice Support
by: Yang, Hannah, et al.
Published: (2025)
Splatonic: Architecture Support for 3D Gaussian Splatting SLAM via Sparse Processing
by: Huang, Xiaotong, et al.
Published: (2025)
DR-CGRA: Supporting Loop-Carried Dependencies in CGRAs Without Spilling Intermediate Values
by: Hadar, Elad, et al.
Published: (2024)
A Node-Based Polar List Decoder with Frame Interleaving and Ensemble Decoding Support
by: Ren, Yuqing, et al.
Published: (2024)
DeMM: A Decoupled Matrix Multiplication Engine Supporting Relaxed Structured Sparsity
by: Peltekis, Christodoulos, et al.
Published: (2024)
Energy-adaptive Buffering for Efficient, Responsive, and Persistent Batteryless Systems
by: Williams, Harrison, et al.
Published: (2024)
FEATHER: A Reconfigurable Accelerator with Data Reordering Support for Low-Cost On-Chip Dataflow Switching
by: Tong, Jianming, et al.
Published: (2024)
PUMA: Efficient and Low-Cost Memory Allocation and Alignment Support for Processing-Using-Memory Architectures
by: Oliveira, Geraldo F., et al.
Published: (2024)
SEGA-DCIM: Design Space Exploration-Guided Automatic Digital CIM Compiler with Multiple Precision Support
by: Diao, Haikang, et al.
Published: (2025)
Jack Unit: An Area- and Energy-Efficient Multiply-Accumulate (MAC) Unit Supporting Diverse Data Formats
by: Noh, Seock-Hwan, et al.
Published: (2025)
ReDas: A Lightweight Architecture for Supporting Fine-Grained Reshaping and Multiple Dataflows on Systolic Array
by: Han, Meng, et al.
Published: (2023)
Global Optimizations & Lightweight Dynamic Logic for Concurrency
by: Pati, Suchita, et al.
Published: (2024)
All-rounder: A Flexible AI Accelerator with Diverse Data Format Support and Morphable Structure for Multi-DNN Processing
by: Noh, Seock-Hwan, et al.
Published: (2023)
From RTL to Prompt Coding: Empowering the Next Generation of Chip Designers through LLMs
by: Krupp, Lukas, et al.
Published: (2026)
FastFlow in FPGA Stacks of Data Centers
by: Paul, Rourab, et al.
Published: (2024)
An Open-Source Flow for Single-Phase, Edge-Triggered to Two-Phase, Non-Overlapping Clocking Conversion
by: Pedroso, Paolo, et al.
Published: (2026)
High-Level Synthesis of Digital Circuits from Template Haskell and SDF-AP
by: Folmer, Hendrik, et al.
Published: (2025)
Optimizing Scalable Multi-Cluster Architectures for Next-Generation Wireless Sensing and Communication
by: Riedel, Samuel, et al.
Published: (2025)
Functional ISS-Driven Verification of Superscalar RISC-V Processors
by: Galimberti, Andrea, et al.
Published: (2024)
FlexCross: High-Speed and Flexible Packet Processing via a Crosspoint-Queued Crossbar
by: Zyla, Klajd, et al.
Published: (2024)
Hardware and software build flow with SoCMake
by: Pejašinović, Risto, et al.
Published: (2025)
A Dynamic Allocation Scheme for Adaptive Shared-Memory Mapping on Kilo-core RV Clusters for Attention-Based Model Deployment
by: Wang, Bowen, et al.
Published: (2025)
Tensor Memory Engine: On-the-fly Data Reorganization for Ideal Locality
by: Hoornaert, Denis, et al.
Published: (2026)
FlooNoC: A 645 Gbps/link 0.15 pJ/B/hop Open-Source NoC with Wide Physical Links and End-to-End AXI4 Parallel Multi-Stream Support
by: Fischer, Tim, et al.
Published: (2024)