:: Library Catalog

Bibliographic Details
Main Authors:	Zhang, Wenbo; Liu, Yiqi; Bao, Zhenshan
Format:	Preprint
Published:	2024
Subjects:	Hardware Architecture
Online Access:	https://arxiv.org/abs/2409.09689

Similar Items

Optimizing GEMM for Energy and Performance on Versal ACAP Architectures
by: Papalamprou, Ilias, et al.
Published: (2025)

WideSA: A High Array Utilization Mapping Scheme for Uniform Recurrences on the Versal ACAP Architecture
by: Dai, Tuo, et al.
Published: (2024)

DPUV4E: High-Throughput DPU Architecture Design for CNN on Versal ACAP
by: Li, Guoyu, et al.
Published: (2025)

AP-DRL: A Synergistic Algorithm-Hardware Framework for Automatic Task Partitioning of Deep Reinforcement Learning on Versal ACAP
by: Li, Enlai, et al.
Published: (2026)

Accelerating CRONet on AMD Versal AIE-ML Engines
by: Mhatre, Kaustubh, et al.
Published: (2026)

Accelerating Elliptic Curve Point Additions on Versal AI Engine for Multi-scalar Multiplication
by: Ohno, Ayumi, et al.
Published: (2025)

GAMA: High-Performance GEMM Acceleration on AMD Versal ML-Optimized AI Engines
by: Mhatre, Kaustubh, et al.
Published: (2025)

AMD Versal Implementations of FAM and SSCA Estimators
by: Li, Carol Jingyi, et al.
Published: (2025)

Exploring the Versal AI Engine for 3D Gaussian Splatting
by: Shimamura, Kotaro, et al.
Published: (2025)

Enabling Mixed criticality applications for the Versal AI-Engines
by: Sprave, Vincent, et al.
Published: (2026)

Lyra: A Hardware-Accelerated RISC-V Verification Framework with Generative Model-Based Processor Fuzzing
by: Huo, Juncheng, et al.
Published: (2025)

ApproxPilot: A GNN-based Accelerator Approximation Framework
by: Zhang, Qing, et al.
Published: (2024)

AGON: Automated Design Framework for Customizing Processors from ISA Documents
by: Li, Chongxiao, et al.
Published: (2024)

An Efficient Sparse Hardware Accelerator for Spike-Driven Transformer
by: Li, Zhengke, et al.
Published: (2025)

FPGA-Optimized Hardware Accelerator for Fast Fourier Transform and Singular Value Decomposition in AI
by: Ding, Hong, et al.
Published: (2025)

Efficient Implementation of an Adaptive Transformer Accelerator for Massive MIMO Outdoor Localization
by: Yaman, Ilayda, et al.
Published: (2026)

TATAA: Programmable Mixed-Precision Transformer Acceleration with a Transformable Arithmetic Architecture
by: Wu, Jiajun, et al.
Published: (2024)

Holistic Optimization Framework for FPGA Accelerators
by: Pouget, Stéphane, et al.
Published: (2025)

Optimized Spatial Architecture Mapping Flow for Transformer Accelerators
by: Xu, Haocheng, et al.
Published: (2024)

GSIM: Accelerating RTL Simulation for Large-Scale Designs
by: Chen, Lu, et al.
Published: (2025)

Mozart: A Chiplet Ecosystem-Accelerator Codesign Framework for Composable Bespoke Application Specific Integrated Circuits
by: Jin, Haoran, et al.
Published: (2025)

CAMASim: A Comprehensive Simulation Framework for Content-Addressable Memory based Accelerators
by: Li, Mengyuan, et al.
Published: (2024)

Xpikeformer: Hybrid Analog-Digital Hardware Acceleration for Spiking Transformers
by: Song, Zihang, et al.
Published: (2024)

Scope: A Scalable Merged Pipeline Framework for Multi-Chip-Module NN Accelerators
by: Huang, Zongle, et al.
Published: (2026)

FPGA-based Emulation and Device-Side Management for CXL-based Memory Tiering Systems
by: Chen, Yiqi, et al.
Published: (2025)

PIM-GPT: A Hybrid Process-in-Memory Accelerator for Autoregressive Transformers
by: Wu, Yuting, et al.
Published: (2023)

MARVEL: An End-to-End Framework for Generating Model-Class Aware Custom RISC-V Extensions for Lightweight AI
by: M, Ajay Kumar, et al.
Published: (2025)

FireFly-T: High-Throughput Sparsity Exploitation for Spiking Transformer Acceleration with Dual-Engine Overlay Architecture
by: Li, Tenglong, et al.
Published: (2025)

Ouroboros: Wafer-Scale SRAM CIM with Token-Grained Pipelining for Large Language Model Inference
by: Liu, Yiqi, et al.
Published: (2026)

Custom Algorithm-based Fault Tolerance for Attention Layers in Transformers
by: Titopoulos, Vasileios, et al.
Published: (2025)

Enabling Efficient Hardware Acceleration of Hybrid Vision Transformer (ViT) Networks at the Edge
by: Dumoulin, Joren, et al.
Published: (2025)

Hardware Efficient Accelerator for Spiking Transformer With Reconfigurable Parallel Time Step Computing
by: Chen, Bo-Yu, et al.
Published: (2025)

Hardware-Software Co-Design for Accelerating Transformer Inference Leveraging Compute-in-Memory
by: Kim, Dong Eun, et al.
Published: (2025)

A Reconfigurable Framework for AI-FPGA Agent Integration and Acceleration
by: Yunusoglu, Aybars, et al.
Published: (2026)

TurboFuzz: FPGA Accelerated Hardware Fuzzing for Processor Agile Verification
by: Zhong, Yang, et al.
Published: (2025)

NX-CGRA: A Programmable Hardware Accelerator for Core Transformer Algorithms on Edge Devices
by: Prasad, Rohit
Published: (2025)

A Dataflow Compiler for Efficient LLM Inference using Custom Microscaling Formats
by: Cheng, Jianyi, et al.
Published: (2023)

PIMSIM-NN: An ISA-based Simulation Framework for Processing-in-Memory Accelerators
by: Wang, Xinyu, et al.
Published: (2024)

TeAAL: A Declarative Framework for Modeling Sparse Tensor Accelerators
by: Nayak, Nandeeka, et al.
Published: (2023)

Memory-Guided Unified Hardware Accelerator for Mixed-Precision Scientific Computing
by: Wang, Chuanzhen, et al.
Published: (2026)