| Main Authors: | Yang, Yu; González, Jordi Altayó; Delestrac, Paul; Hemani, Ahmed |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2407.00207 |
Similar Items
Addressing memory bandwidth scalability in vector processors for streaming applications
by: Altayo, Jordi, et al.
Published: (2025)
'1'-bit Count-based Sorting Unit to Reduce Link Power in DNN Accelerators
by: Han, Ruichi, et al.
Published: (2026)
pHNSW: PCA-Based Filtering to Accelerate HNSW Approximate Nearest Neighbor Search
by: Li, Zheng, et al.
Published: (2026)
Late Breaking Results: Quamba-SE: Soft-edge Quantizer for Activations in State Space Models
by: Chen, Yizhi, et al.
Published: (2026)
Demystifying the 7-D Convolution Loop Nest for Data and Instruction Streaming in Reconfigurable AI Accelerators
by: Chowdhury, Md Rownak Hossain, et al.
Published: (2025)
Mozart: A Chiplet Ecosystem-Accelerator Codesign Framework for Composable Bespoke Application Specific Integrated Circuits
by: Jin, Haoran, et al.
Published: (2025)
ARISE: Automating RISC-V Instruction Set Extension
by: Hager-Clukas, Andreas, et al.
Published: (2025)
A Dense and Efficient Instruction Set Architecture Encoding
by: Maroun, Emad Jacob
Published: (2025)
Sparsity-Aware Streaming SNN Accelerator with Output-Channel Dataflow for Automatic Modulation Classification
by: Yang, Kuilian, et al.
Published: (2026)
Composing Mini Oscilloscope on Embedded Systems
by: Romero, Brennan, et al.
Published: (2025)
FILCO: Flexible Composing Architecture with Real-Time Reconfigurability for DNN Acceleration
by: Chen, Xingzhen, et al.
Published: (2026)
Rainbow: A Composable Coherence Protocol for Multi-Chip Servers
by: Menezo, Lucia G., et al.
Published: (2020)
Control Flow Management in Modern GPUs
by: Shoushtary, Mojtaba Abaie, et al.
Published: (2024)
StreamGrid: Streaming Point Cloud Analytics via Compulsory Splitting and Deterministic Termination
by: Feng, Yu, et al.
Published: (2025)
Reconfigurable Quantum Instruction Set Computers for High Performance Attainable on Hardware
by: Yang, Zhaohui, et al.
Published: (2025)
Mitigating the Bandwidth Wall via Data-Streaming System-Accelerator Co-Design
by: Liu, Qunyou, et al.
Published: (2026)
StreamTensor: Make Tensors Stream in Dataflow Accelerators for LLMs
by: Ye, Hanchen, et al.
Published: (2025)
DataMaestro: A Versatile and Efficient Data Streaming Engine Bringing Decoupled Memory Access To Dataflow Accelerators
by: Yi, Xiaoling, et al.
Published: (2025)
Pedagogically Motivated and Composable Open-Source RISC-V Processors for Computer Science Education
by: McDougall, Ian, et al.
Published: (2025)
fSEAD: a Composable FPGA-based Streaming Ensemble Anomaly Detection Library
by: Lou, Binglei, et al.
Published: (2024)
A Composable Dynamic Sparse Dataflow Architecture for Efficient Event-based Vision Processing on FPGA
by: Gao, Yizhao, et al.
Published: (2024)
GraCo -- A Graph Composer for Integrated Circuits
by: Uhlich, Stefan, et al.
Published: (2024)
Algorithms for Improving the Automatically Synthesized Instruction Set of an Extensible Processor
by: Sovietov, Peter
Published: (2024)
Reconfigurable Stream Network Architecture
by: Wang, Chengyue, et al.
Published: (2024)
SpikeStream: Accelerating Spiking Neural Network Inference on RISC-V Clusters with Sparse Computation Extensions
by: Manoni, Simone, et al.
Published: (2025)
DORA: Dataflow-Instruction Orchestration Architecture for DNN Acceleration
by: Chen, Xingzhen, et al.
Published: (2026)
MINISA: Minimal Instruction Set Architecture for Next-gen Reconfigurable Inference Accelerator
by: Tong, Jianming, et al.
Published: (2026)
Stream-HLS: Towards Automatic Dataflow Acceleration
by: Basalama, Suhail, et al.
Published: (2025)
StreamDCIM: A Tile-based Streaming Digital CIM Accelerator with Mixed-stationary Cross-forwarding Dataflow for Multimodal Transformer
by: Qin, Shantian, et al.
Published: (2025)
Garibaldi: A Pairwise Instruction-Data Management for Enhancing Shared Last-Level Cache Performance in Server Workloads
by: Kwon, Jaewon, et al.
Published: (2025)
Instruction Scheduling in the Saturn Vector Unit
by: Zhao, Jerry, et al.
Published: (2024)
Implementing and Optimizing the Scaled Dot-Product Attention on Streaming Dataflow
by: Sohn, Gina, et al.
Published: (2024)
A Quantitative Analysis and Guidelines of Data Streaming Accelerator in Modern Intel Xeon Scalable Processors
by: Kuper, Reese, et al.
Published: (2023)
FastFlow in FPGA Stacks of Data Centers
by: Paul, Rourab, et al.
Published: (2024)
Focus: A Streaming Concentration Architecture for Efficient Vision-Language Models
by: Wei, Chiyue, et al.
Published: (2025)
Multi-Dimensional Reconfigurable, Physically Composable Hybrid Diffractive Optical Neural Network
by: Yin, Ziang, et al.
Published: (2024)
Shared-PIM: Enabling Concurrent Computation and Data Flow for Faster Processing-in-DRAM
by: Mamdouh, Ahmed, et al.
Published: (2024)
LUTstructions: Self-loading FPGA-based Reconfigurable Instructions
by: Papaphilippou, Philippos
Published: (2026)
Efficient Implementation of RISC-V Vector Permutation Instructions
by: Titopoulos, Vasileios, et al.
Published: (2025)
Stream: Design Space Exploration of Layer-Fused DNNs on Heterogeneous Dataflow Accelerators
by: Symons, Arne, et al.
Published: (2022)