| Main Authors: | Zhang, Lizi; Davoodi, Azadeh |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2408.03292 |
Similar Items
Global and Local Attention-based Inception U-Net for Static IR Drop Prediction
by: Chen, Yilu, et al.
Published: (2024)
Estimating Voltage Drop: Models, Features and Data Representation Towards a Neural Surrogate
by: Jin, Yifei, et al.
Published: (2025)
SystolicAttention: Fusing FlashAttention within a Single Systolic Array
by: Lin, Jiawei, et al.
Published: (2025)
Salca: A Sparsity-Aware Hardware Accelerator for Efficient Long-Context Attention Decoding
by: Fan, Wang, et al.
Published: (2026)
ApproXAI: Energy-Efficient Hardware Acceleration of Explainable AI using Approximate Computing
by: Siddique, Ayesha, et al.
Published: (2025)
VeriHGN: Heterogeneous Graph-Based Congestion Prediction for Chip Layout Verification
by: Hu, Runbang, et al.
Published: (2026)
Reducing the Cost of Dropout in Flash-Attention by Hiding RNG with GEMM
by: Ma, Haiyue, et al.
Published: (2024)
SWAT: Scalable and Efficient Window Attention-based Transformers Acceleration on FPGAs
by: Bai, Zhenyu, et al.
Published: (2024)
TReX- Reusing Vision Transformer's Attention for Efficient Xbar-based Computing
by: Moitra, Abhishek, et al.
Published: (2024)
Rethinking LLM Inference Bottlenecks: Insights from Latent Attention and Mixture-of-Experts
by: Yun, Sungmin, et al.
Published: (2025)
MAGNet: A Multi-Scale Attention-Guided Graph Fusion Network for DRC Violation Detection
by: Lu, Weihan, et al.
Published: (2025)
From Buffers to Registers: Unlocking Fine-Grained FlashAttention with Hybrid-Bonded 3D NPU Co-Design
by: Yu, Jinxin, et al.
Published: (2026)
AIM: Software and Hardware Co-design for Architecture-level IR-drop Mitigation in High-performance PIM
by: Zhang, Yuanpeng, et al.
Published: (2025)
Classification-Based Automatic HDL Code Generation Using LLMs
by: Sun, Wenhao, et al.
Published: (2024)
Circuit Diagram Retrieval Based on Hierarchical Circuit Graph Representation
by: Gao, Ming, et al.
Published: (2025)
Characterizing Soft-Error Resiliency in Arm's Ethos-U55 Embedded Machine Learning Accelerator
by: Tyagi, Abhishek, et al.
Published: (2024)
POET: Power-Oriented Evolutionary Tuning for LLM-Based RTL PPA Optimization
by: Ping, Heng, et al.
Published: (2026)
A Two Level Neural Approach Combining Off-Chip Prediction with Adaptive Prefetch Filtering
by: Jamet, Alexandre Valentin, et al.
Published: (2024)
Benchmarking End-To-End Performance of AI-Based Chip Placement Algorithms
by: Wang, Zhihai, et al.
Published: (2024)
TurboAttention: Efficient Attention Approximation For High Throughputs LLMs
by: Kang, Hao, et al.
Published: (2024)
VeriLoC: Line-of-Code Level Prediction of Hardware Design Quality from Verilog Code
by: Hemadri, Raghu Vamshi, et al.
Published: (2025)
Enabling On-Device Medical AI Assistants via Input-Driven Saliency Adaptation
by: Kallakurik, Uttej, et al.
Published: (2025)
FPGA-Based Neural Network Accelerators for Space Applications: A Survey
by: Antunes, Pedro, et al.
Published: (2025)
Chiplet-Based RISC-V SoC with Modular AI Acceleration
by: Bharadwaj, Suhas Suresh, et al.
Published: (2025)
Sangam: Chiplet-Based DRAM-PIM Accelerator with CXL Integration for LLM Inferencing
by: Kiyawat, Khyati, et al.
Published: (2025)
Learning in Log-Domain: Subthreshold Analog AI Accelerator Based on Stochastic Gradient Descent
by: Tageldeen, Momen K, et al.
Published: (2025)
Exploration of Unary Arithmetic-Based Matrix Multiply Units for Low Precision DL Accelerators
by: Vellaisamy, Prabhu, et al.
Published: (2026)
Explainable AI-Guided Efficient Approximate DNN Generation for Multi-Pod Systolic Arrays
by: Siddique, Ayesha, et al.
Published: (2025)
EdgeCIM: A Hardware-Software Co-Design for CIM-Based Acceleration of Small Language Models
by: Bazzi, Jinane, et al.
Published: (2026)
Optimizing Neural Networks with Learnable Non-Linear Activation Functions via Lookup-Based FPGA Acceleration
by: Yin, Mengyuan, et al.
Published: (2025)
Idle is the New Sleep: Configuration-Aware Alternative to Powering Off FPGA-Based DL Accelerators During Inactivity
by: Qian, Chao, et al.
Published: (2024)
HYPERHEURIST: A Simulated Annealing-Based Control Framework for LLM-Driven Code Generation in Optimized Hardware Design
by: Ahir, Shiva, et al.
Published: (2026)
At the Edge of the Heart: ULP FPGA-Based CNN for On-Device Cardiac Feature Extraction in Smart Health Sensors for Astronauts
by: Rahman, Kazi Mohammad Abidur, et al.
Published: (2026)
EMSpice 3: Full-chip Temperature-Aware Multiphysics Electromigration and IR-Drop Analysis
by: Lu, Haotian, et al.
Published: (2026)
DeepV: A Model-Agnostic Retrieval-Augmented Framework for Verilog Code Generation with a High-Quality Knowledge Base
by: Ibnat, Zahin, et al.
Published: (2025)
J3DAI: A tiny DNN-Based Edge AI Accelerator for 3D-Stacked CMOS Image Sensor
by: Tain, Benoit, et al.
Published: (2025)
Dynamic Sparse Attention: Access Patterns and Architecture
by: Levy, Noam
Published: (2026)
Self-Attention to Operator Learning-based 3D-IC Thermal Simulation
by: Huang, Zhen, et al.
Published: (2025)
Comprehensive Design Space Exploration for Tensorized Neural Network Hardware Accelerators
by: Zhang, Jinsong, et al.
Published: (2025)
KirchhoffNet: A Scalable Ultra Fast Analog Neural Network
by: Gao, Zhengqi, et al.
Published: (2023)