| Main Authors: | Wickramasinghe, Sachini; Ye, Tian; Raghavendra, Cauligi; Prasanna, Viktor |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2603.03598 |
Similar Items
VTR: An Optimized Vision Transformer for SAR ATR Acceleration on FPGA
by: Wickramasinghe, Sachini, et al.
Published: (2024)
Efficient and Accurate Graph Classification with Hyperdimensional Computing on FPGA
by: Arockiaraj, Jebacyril, et al.
Published: (2025)
FAST-Prefill: FPGA Accelerated Sparse Attention for Long Context LLM Prefill
by: Jayanth, Rakshith, et al.
Published: (2026)
Harmonia: Algorithm-Hardware Co-Design for Memory- and Compute-Efficient BFP-based LLM Inference
by: Wang, Xinyu, et al.
Published: (2026)
CogSys: Efficient and Scalable Neurosymbolic Cognition System via Algorithm-Hardware Co-Design
by: Wan, Zishen, et al.
Published: (2025)
Algorithm and Hardware Co-Design for Efficient Complex-Valued Uncertainty Estimation
by: Zhang, Zehuan, et al.
Published: (2026)
Co-Design of CNN Accelerators for TinyML using Approximate Matrix Decomposition
by: Morales, José Juan Hernández, et al.
Published: (2026)
GenDRAM: Hardware-Software Co-Design of General Platform in DRAM
by: Lu, Tsung-Han, et al.
Published: (2026)
Finesse: An Agile Design Framework for Pairing-based Cryptography via Software/Hardware Co-Design
by: Pan, Tianwei, et al.
Published: (2025)
METRO: A Software-Hardware Co-Design of Interconnections for Spatial DNN Accelerators
by: Wang, Zhao, et al.
Published: (2021)
Hardware-Software Co-Design for Accelerating Transformer Inference Leveraging Compute-in-Memory
by: Kim, Dong Eun, et al.
Published: (2025)
NeoMem: Hardware/Software Co-Design for CXL-Native Memory Tiering
by: Zhou, Zhe, et al.
Published: (2024)
Hardware-Software Co-Design of Scalable, Energy-Efficient Analog Recurrent Computations
by: Fyon, Arthur, et al.
Published: (2026)
Cerberus: Cross-Layer ECC Co-Design for Robust and Efficient Memory Protection
by: Kim, Junhwan, et al.
Published: (2026)
Hardware Efficient Accelerator for Spiking Transformer With Reconfigurable Parallel Time Step Computing
by: Chen, Bo-Yu, et al.
Published: (2025)
TTP: A Hardware-Efficient Design for Precise Prefetching in Ray Tracing
by: Tozlu, Yavuz Selim, et al.
Published: (2026)
MERE: Hardware-Software Co-Design for Masking Cache Miss Latency in Embedded Processors
by: You, Dean, et al.
Published: (2025)
PIM-FW: Hardware-Software Co-Design of All-pairs Shortest Paths in DRAM
by: Lu, Tsung-Han, et al.
Published: (2025)
FETTA: Flexible and Efficient Hardware Accelerator for Tensorized Neural Network Training
by: Lu, Jinming, et al.
Published: (2025)
Towards Efficient IMC Accelerator Design Through Joint Hardware-Workload Co-optimization
by: Krestinskaya, Olga, et al.
Published: (2024)
Palermo: Improving the Performance of Oblivious Memory using Protocol-Hardware Co-Design
by: Ye, Haojie, et al.
Published: (2024)
Energy Efficient Software Hardware CoDesign for Machine Learning: From TinyML to Large Language Models
by: Vahdatpour, Mohammad Saleh, et al.
Published: (2026)
SD-Acc: Accelerating Stable Diffusion through Phase-aware Sampling and Hardware Co-Optimizations
by: Wang, Zhican, et al.
Published: (2025)
Aquas: Enhancing Domain Specialization through Holistic Hardware-Software Co-Optimization based on MLIR
by: Zou, Yuyang, et al.
Published: (2025)
Hardware-Software Co-Design for Event-Driven SNN Deployment on Low-Cost Neuromorphic FPGAs
by: Lee, Jiwoon, et al.
Published: (2026)
EdgeCIM: A Hardware-Software Co-Design for CIM-Based Acceleration of Small Language Models
by: Bazzi, Jinane, et al.
Published: (2026)
PaCKD: Pattern-Clustered Knowledge Distillation for Compressing Memory Access Prediction Models
by: Gupta, Neelesh, et al.
Published: (2024)
Enabling Long FFT Convolutions on Memory-Constrained FPGAs via Chunking
by: Wang, Peter, et al.
Published: (2025)
DSLR-CNN: Efficient CNN Acceleration using Digit-Serial Left-to-Right Arithmetic
by: Nisar, Malik Zohaib, et al.
Published: (2025)
A Time- and Energy-Efficient CNN with Dense Connections on Memristor-Based Chips
by: Zhou, Wenyong, et al.
Published: (2025)
SkyByte: Architecting an Efficient Memory-Semantic CXL-based SSD with OS and Hardware Co-design
by: Zhang, Haoyang, et al.
Published: (2025)
A Novel FPGA-based CNN Hardware Accelerator: Optimization for Convolutional Layers using Karatsuba Ofman Multiplier
by: Sarkar, Amit
Published: (2024)
FAME: FPGA Acceleration of Secure Matrix Multiplication with Homomorphic Encryption
by: Xu, Zhihan, et al.
Published: (2025)
Gen-NeRF: Efficient and Generalizable Neural Radiance Fields via Algorithm-Hardware Co-Design
by: Fu, Yonggan, et al.
Published: (2023)
In-Memory Computing Architecture for Efficient Hardware Security
by: Ajmi, Hala, et al.
Published: (2024)
APSQ: Additive Partial Sum Quantization with Algorithm-Hardware Co-Design
by: Tan, Yonghao, et al.
Published: (2025)
Marco: Configurable Graph-Based Task Solving and Multi-AI Agents Framework for Hardware Design
by: Ho, Chia-Tung, et al.
Published: (2025)
A Power-Efficient Hardware Implementation of L-Mul
by: Chen, Ruiqi, et al.
Published: (2024)
An Efficient Sparse Hardware Accelerator for Spike-Driven Transformer
by: Li, Zhengke, et al.
Published: (2025)
GenPairX: A Hardware-Algorithm Co-Designed Accelerator for Paired-End Read Mapping
by: Eudine, Julien, et al.
Published: (2026)