Bibliographic Details
Main Authors:	Zhang, Lizi; Davoodi, Azadeh
Format:	Preprint
Published:	2024
Subjects:	Hardware Architecture; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2408.03292

Similar Items

Global and Local Attention-based Inception U-Net for Static IR Drop Prediction
by: Chen, Yilu, et al.
Published: (2024)

Estimating Voltage Drop: Models, Features and Data Representation Towards a Neural Surrogate
by: Jin, Yifei, et al.
Published: (2025)

SystolicAttention: Fusing FlashAttention within a Single Systolic Array
by: Lin, Jiawei, et al.
Published: (2025)

Salca: A Sparsity-Aware Hardware Accelerator for Efficient Long-Context Attention Decoding
by: Fan, Wang, et al.
Published: (2026)

ApproXAI: Energy-Efficient Hardware Acceleration of Explainable AI using Approximate Computing
by: Siddique, Ayesha, et al.
Published: (2025)

VeriHGN: Heterogeneous Graph-Based Congestion Prediction for Chip Layout Verification
by: Hu, Runbang, et al.
Published: (2026)

Reducing the Cost of Dropout in Flash-Attention by Hiding RNG with GEMM
by: Ma, Haiyue, et al.
Published: (2024)

SWAT: Scalable and Efficient Window Attention-based Transformers Acceleration on FPGAs
by: Bai, Zhenyu, et al.
Published: (2024)

TReX- Reusing Vision Transformer's Attention for Efficient Xbar-based Computing
by: Moitra, Abhishek, et al.
Published: (2024)

Rethinking LLM Inference Bottlenecks: Insights from Latent Attention and Mixture-of-Experts
by: Yun, Sungmin, et al.
Published: (2025)

MAGNet: A Multi-Scale Attention-Guided Graph Fusion Network for DRC Violation Detection
by: Lu, Weihan, et al.
Published: (2025)

From Buffers to Registers: Unlocking Fine-Grained FlashAttention with Hybrid-Bonded 3D NPU Co-Design
by: Yu, Jinxin, et al.
Published: (2026)

AIM: Software and Hardware Co-design for Architecture-level IR-drop Mitigation in High-performance PIM
by: Zhang, Yuanpeng, et al.
Published: (2025)

Classification-Based Automatic HDL Code Generation Using LLMs
by: Sun, Wenhao, et al.
Published: (2024)

Circuit Diagram Retrieval Based on Hierarchical Circuit Graph Representation
by: Gao, Ming, et al.
Published: (2025)

Characterizing Soft-Error Resiliency in Arm's Ethos-U55 Embedded Machine Learning Accelerator
by: Tyagi, Abhishek, et al.
Published: (2024)

POET: Power-Oriented Evolutionary Tuning for LLM-Based RTL PPA Optimization
by: Ping, Heng, et al.
Published: (2026)

A Two Level Neural Approach Combining Off-Chip Prediction with Adaptive Prefetch Filtering
by: Jamet, Alexandre Valentin, et al.
Published: (2024)

Benchmarking End-To-End Performance of AI-Based Chip Placement Algorithms
by: Wang, Zhihai, et al.
Published: (2024)

TurboAttention: Efficient Attention Approximation For High Throughputs LLMs
by: Kang, Hao, et al.
Published: (2024)

VeriLoC: Line-of-Code Level Prediction of Hardware Design Quality from Verilog Code
by: Hemadri, Raghu Vamshi, et al.
Published: (2025)

Enabling On-Device Medical AI Assistants via Input-Driven Saliency Adaptation
by: Kallakurik, Uttej, et al.
Published: (2025)

FPGA-Based Neural Network Accelerators for Space Applications: A Survey
by: Antunes, Pedro, et al.
Published: (2025)

Chiplet-Based RISC-V SoC with Modular AI Acceleration
by: Bharadwaj, Suhas Suresh, et al.
Published: (2025)

Sangam: Chiplet-Based DRAM-PIM Accelerator with CXL Integration for LLM Inferencing
by: Kiyawat, Khyati, et al.
Published: (2025)

Learning in Log-Domain: Subthreshold Analog AI Accelerator Based on Stochastic Gradient Descent
by: Tageldeen, Momen K, et al.
Published: (2025)

Exploration of Unary Arithmetic-Based Matrix Multiply Units for Low Precision DL Accelerators
by: Vellaisamy, Prabhu, et al.
Published: (2026)

Explainable AI-Guided Efficient Approximate DNN Generation for Multi-Pod Systolic Arrays
by: Siddique, Ayesha, et al.
Published: (2025)

EdgeCIM: A Hardware-Software Co-Design for CIM-Based Acceleration of Small Language Models
by: Bazzi, Jinane, et al.
Published: (2026)

Optimizing Neural Networks with Learnable Non-Linear Activation Functions via Lookup-Based FPGA Acceleration
by: Yin, Mengyuan, et al.
Published: (2025)

Idle is the New Sleep: Configuration-Aware Alternative to Powering Off FPGA-Based DL Accelerators During Inactivity
by: Qian, Chao, et al.
Published: (2024)

HYPERHEURIST: A Simulated Annealing-Based Control Framework for LLM-Driven Code Generation in Optimized Hardware Design
by: Ahir, Shiva, et al.
Published: (2026)

At the Edge of the Heart: ULP FPGA-Based CNN for On-Device Cardiac Feature Extraction in Smart Health Sensors for Astronauts
by: Rahman, Kazi Mohammad Abidur, et al.
Published: (2026)

EMSpice 3: Full-chip Temperature-Aware Multiphysics Electromigration and IR-Drop Analysis
by: Lu, Haotian, et al.
Published: (2026)

DeepV: A Model-Agnostic Retrieval-Augmented Framework for Verilog Code Generation with a High-Quality Knowledge Base
by: Ibnat, Zahin, et al.
Published: (2025)

J3DAI: A tiny DNN-Based Edge AI Accelerator for 3D-Stacked CMOS Image Sensor
by: Tain, Benoit, et al.
Published: (2025)

Dynamic Sparse Attention: Access Patterns and Architecture
by: Levy, Noam
Published: (2026)

Self-Attention to Operator Learning-based 3D-IC Thermal Simulation
by: Huang, Zhen, et al.
Published: (2025)

Comprehensive Design Space Exploration for Tensorized Neural Network Hardware Accelerators
by: Zhang, Jinsong, et al.
Published: (2025)

KirchhoffNet: A Scalable Ultra Fast Analog Neural Network
by: Gao, Zhengqi, et al.
Published: (2023)