| Main Authors: | Toupas, Petros, Yu, Zhewen, Bouganis, Christos-Savvas, Tzovaras, Dimitrios |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2403.18921 |
Similar Items
fpgaHART: A toolflow for throughput-oriented acceleration of 3D CNNs for HAR onto FPGAs
by: Toupas, Petros, et al.
Published: (2023)
FMM-X3D: FPGA-based modeling and mapping of X3D for Human Action Recognition
by: Toupas, Petros, et al.
Published: (2023)
HARFLOW3D: A Latency-Oriented 3D-CNN Accelerator Toolflow for HAR on FPGA Devices
by: Toupas, Petros, et al.
Published: (2023)
ATHEENA: A Toolflow for Hardware Early-Exit Network Automation
by: Biggs, Benjamin, et al.
Published: (2023)
ITERA-LLM: Boosting Sub-8-Bit Large Language Model Inference via Iterative Tensor Decomposition
by: Zheng, Keran, et al.
Published: (2025)
HASS: Hardware-Aware Sparsity Search for Dataflow DNN Accelerator
by: Yu, Zhewen, et al.
Published: (2024)
Shedding the Bits: Pushing the Boundaries of Quantization with Minifloats on FPGAs
by: Aggarwal, Shivam, et al.
Published: (2023)
A Dataflow Compiler for Efficient LLM Inference using Custom Microscaling Formats
by: Cheng, Jianyi, et al.
Published: (2023)
From Detection to Action Recognition: An Edge-Based Pipeline for Robot Human Perception
by: Toupas, Petros, et al.
Published: (2023)
QOC: Quantum On-Chip Training with Parameter Shift and Gradient Pruning
by: Wang, Hanrui, et al.
Published: (2022)
Real-Time Object Detection and Classification using YOLO for Edge FPGAs
by: Amin, Rashed Al, et al.
Published: (2025)
NeuralFuse: Learning to Recover the Accuracy of Access-Limited Neural Network Inference in Low-Voltage Regimes
by: Sun, Hao-Lun, et al.
Published: (2023)
ViTCoD: Vision Transformer Acceleration via Dedicated Algorithm and Accelerator Co-Design
by: You, Haoran, et al.
Published: (2022)
Dynamic Tsetlin Machine Accelerators for On-Chip Training at the Edge using FPGAs
by: Mao, Gang, et al.
Published: (2025)
ViM-Q: Scalable Algorithm-Hardware Co-Design for Vision Mamba Model Inference on FPGA
by: Lyu, Shengzhe, et al.
Published: (2026)
Benchmarking Deep Learning Models on NVIDIA Jetson Nano for Real-Time Systems: An Empirical Investigation
by: Swaminathan, Tushar Prasanna, et al.
Published: (2024)
Energy Efficient Exact and Approximate Systolic Array Architecture for Matrix Multiplication
by: Jaswal, Pragun, et al.
Published: (2025)
Stella Nera: A Differentiable Maddness-Based Hardware Accelerator for Efficient Approximate Matrix Multiplication
by: Schönleber, Jannis, et al.
Published: (2023)
AHCQ-SAM: Toward Accurate and Hardware-Compatible Post-Training Segment Anything Model Quantization
by: Zhang, Wenlun, et al.
Published: (2025)
Ditto: Accelerating Diffusion Model via Temporal Value Similarity
by: Kim, Sungbin, et al.
Published: (2025)
Smaller, Faster, Cheaper: Architectural Designs for Efficient Machine Learning
by: Walton, Steven
Published: (2025)
CRISP: Hybrid Structured Sparsity for Class-aware Model Pruning
by: Aggarwal, Shivam, et al.
Published: (2023)
Performance Analysis of Edge and In-Sensor AI Processors: A Comparative Review
by: Capogrosso, Luigi, et al.
Published: (2026)
QUILL: An Algorithm-Architecture Co-Design for Cache-Local Deformable Attention
by: Oh, Hyunwoo, et al.
Published: (2025)
Neuro-Channel Networks: A Multiplication-Free Architecture by Biological Signal Transmission
by: Mete, Emrah, et al.
Published: (2026)
Mix-and-Match Pruning: Globally Guided Layer-Wise Sparsification of DNNs
by: Monachan, Danial, et al.
Published: (2026)
TsetlinWiSARD: On-Chip Training of Weightless Neural Networks using Tsetlin Automata on FPGAs
by: Duan, Shengyu, et al.
Published: (2026)
Efficient Message Passing Architecture for GCN Training on HBM-based FPGAs with Orthogonal Topology On-Chip Networks
by: Wu, Qizhe, et al.
Published: (2024)
Accelerating 3D Gaussian Splatting with Neural Sorting and Axis-Oriented Rasterization
by: Wang, Zhican, et al.
Published: (2025)
Neural Architecture Search of Hybrid Models for NPU-CIM Heterogeneous AR/VR Devices
by: Zhao, Yiwei, et al.
Published: (2024)
RaGNNarok: A Light-Weight Graph Neural Network for Enhancing Radar Point Clouds on Unmanned Ground Vehicles
by: Hunt, David, et al.
Published: (2025)
On Latency Predictors for Neural Architecture Search
by: Akhauri, Yash, et al.
Published: (2024)
Vision-Based Perception for Autonomous Vehicles in Off-Road Environment Using Deep Learning
by: Neto, Nelson Alves Ferreira
Published: (2025)
Accelerating AI and Computer Vision for Satellite Pose Estimation on the Intel Myriad X Embedded SoC
by: Leon, Vasileios, et al.
Published: (2024)
Real-World Deployment of a Lane Change Prediction Architecture Based on Knowledge Graph Embeddings and Bayesian Inference
by: Manzour, M., et al.
Published: (2025)
TSLA: A Task-Specific Learning Adaptation for Semantic Segmentation on Autonomous Vehicles Platform
by: Liu, Jun, et al.
Published: (2025)
SQ-DM: Accelerating Diffusion Models with Aggressive Quantization and Temporal Sparsity
by: Fan, Zichen, et al.
Published: (2025)
SageAttention2++: A More Efficient Implementation of SageAttention2
by: Zhang, Jintao, et al.
Published: (2025)
FabGPT: An Efficient Large Multimodal Model for Complex Wafer Defect Knowledge Queries
by: Jiang, Yuqi, et al.
Published: (2024)
Real-Time Semantic Segmentation of Aerial Images Using an Embedded U-Net: A Comparison of CPU, GPU, and FPGA Workflows
by: Posso, Julien, et al.
Published: (2025)