| Main Authors: | Yang, Kuilian, Zhang, Li, Eltawil, Ahmed M., Salama, Khaled Nabil |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2601.02613 |
Similar Items
Towards Efficient IMC Accelerator Design Through Joint Hardware-Workload Co-optimization
by: Krestinskaya, Olga, et al.
Published: (2024)
Stream-HLS: Towards Automatic Dataflow Acceleration
by: Basalama, Suhail, et al.
Published: (2025)
HASS: Hardware-Aware Sparsity Search for Dataflow DNN Accelerator
by: Yu, Zhewen, et al.
Published: (2024)
Multilayer Dataflow: Orchestrate Butterfly Sparsity to Accelerate Attention Computation
by: Wu, Haibin, et al.
Published: (2024)
CIMNAS: A Joint Framework for Compute-In-Memory-Aware Neural Architecture Search
by: Krestinskaya, Olga, et al.
Published: (2025)
StreamTensor: Make Tensors Stream in Dataflow Accelerators for LLMs
by: Ye, Hanchen, et al.
Published: (2025)
Joint Hardware-Workload Co-Optimization for In-Memory Computing Accelerators
by: Krestinskaya, Olga, et al.
Published: (2026)
A Sparsity-Aware Autonomous Path Planning Accelerator with HW/SW Co-Design and Multi-Level Dataflow Optimization
by: Zhang, Yifan, et al.
Published: (2025)
FlexNeRFer: A Multi-Dataflow, Adaptive Sparsity-Aware Accelerator for On-Device NeRF Rendering
by: Noh, Seock-Hwan, et al.
Published: (2025)
Stream: Design Space Exploration of Layer-Fused DNNs on Heterogeneous Dataflow Accelerators
by: Symons, Arne, et al.
Published: (2022)
StreamDCIM: A Tile-based Streaming Digital CIM Accelerator with Mixed-stationary Cross-forwarding Dataflow for Multimodal Transformer
by: Qin, Shantian, et al.
Published: (2025)
Implementing and Optimizing the Scaled Dot-Product Attention on Streaming Dataflow
by: Sohn, Gina, et al.
Published: (2024)
DORA: Dataflow-Instruction Orchestration Architecture for DNN Acceleration
by: Chen, Xingzhen, et al.
Published: (2026)
Exploring the Sparsity-Quantization Interplay on a Novel Hybrid SNN Event-Driven Architecture
by: Aliyev, Ilkin, et al.
Published: (2024)
DataMaestro: A Versatile and Efficient Data Streaming Engine Bringing Decoupled Memory Access To Dataflow Accelerators
by: Yi, Xiaoling, et al.
Published: (2025)
AccelCIM: Systematic Dataflow Exploration for SRAM Compute-in-Memory Accelerator
by: Xue, Chenhao, et al.
Published: (2026)
A High-Throughput FPGA Accelerator for Lightweight CNNs With Balanced Dataflow
by: Zhao, Zhiyuan, et al.
Published: (2024)
MIREDO: MIP-Driven Resource-Efficient Dataflow Optimization for Computing-in-Memory Accelerator
by: He, Xiaolin, et al.
Published: (2025)
Surrogates, Spikes, and Sparsity: Performance Analysis and Characterization of SNN Hyperparameters on Hardware
by: Aliyev, Ilkin, et al.
Published: (2026)
STI-SNN: A 0.14 GOPS/W/PE Single-Timestep Inference FPGA-based SNN Accelerator with Algorithm and Hardware Co-Design
by: Wang, Kainan, et al.
Published: (2025)
LLMulator: Generalizable Cost Modeling for Dataflow Accelerators with Input-Adaptive Control Flow
by: Chang, Kaiyan, et al.
Published: (2025)
BF-IMNA: A Bit Fluid In-Memory Neural Architecture for Neural Network Acceleration
by: Rakka, Mariam, et al.
Published: (2024)
Low Power Vision Transformer Accelerator with Hardware-Aware Pruning and Optimized Dataflow
by: Hsiung, Ching-Lin, et al.
Published: (2025)
LoopTree: Exploring the Fused-layer Dataflow Accelerator Design Space
by: Gilbert, Michael, et al.
Published: (2024)
SnapStream: Efficient Long Sequence Decoding on Dataflow Accelerators
by: Li, Jonathan, et al.
Published: (2025)
Prosperity: Accelerating Spiking Neural Networks via Product Sparsity
by: Wei, Chiyue, et al.
Published: (2025)
EdgeCIM: A Hardware-Software Co-Design for CIM-Based Acceleration of Small Language Models
by: Bazzi, Jinane, et al.
Published: (2026)
DAS-MP: Enabling High-Quality Macro Placement with Enhanced Dataflow Awareness
by: Zhao, Xiaotian, et al.
Published: (2025)
CODO: An Automated Compiler for Comprehensive Dataflow Optimization
by: Zhang, Weichuang, et al.
Published: (2026)
SIRA: Scaled-Integer Range Analysis for Optimizing FPGA Dataflow Neural Network Accelerators
by: Umuroglu, Yaman, et al.
Published: (2025)
VEDA: Efficient LLM Generation Through Voting-based KV Cache Eviction and Dataflow-flexible Accelerator
by: Wang, Zhican, et al.
Published: (2025)
SOFA: A Compute-Memory Optimized Sparsity Accelerator via Cross-Stage Coordinated Tiling
by: Wang, Huizheng, et al.
Published: (2024)
VESTA: A Versatile SNN-Based Transformer Accelerator with Unified PEs for Multiple Computational Layers
by: Chen, Ching-Yao, et al.
Published: (2025)
FEATHER: A Reconfigurable Accelerator with Data Reordering Support for Low-Cost On-Chip Dataflow Switching
by: Tong, Jianming, et al.
Published: (2024)
FlatAttention: Dataflow and Fabric Collectives Co-Optimization for Large Attention-Based Model Inference on Tile-Based Accelerators
by: Zhang, Chi, et al.
Published: (2026)
FlatAttention: Dataflow and Fabric Collectives Co-Optimization for Efficient Multi-Head Attention on Tile-Based Many-PE Accelerators
by: Zhang, Chi, et al.
Published: (2025)
Accelerating Recommender Model ETL with a Streaming FPGA-GPU Dataflow
by: Zhu, Yu, et al.
Published: (2025)
SATA: Sparsity-Aware Scheduling for Selective Token Attention
by: Fan, Zhenkun, et al.
Published: (2026)
Revealing CNN Architectures via Side-Channel Analysis in Dataflow-based Inference Accelerators
by: Weerasena, Hansika, et al.
Published: (2023)
Salca: A Sparsity-Aware Hardware Accelerator for Efficient Long-Context Attention Decoding
by: Fan, Wang, et al.
Published: (2026)