| Main Authors: | Erhardt, Jack, Li, Ziang, Pinkham, Reid, Berkovich, Andrew, Zhang, Zhengya |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2502.09528 |
Similar Items
DPU or GPU for Accelerating Neural Networks Inference -- Why not both? Split CNN Inference
by: Oztas, Ali Emre, et al.
Published: (2026)
Vision Transformer Computation and Resilience for Dynamic Inference
by: Sreedhar, Kavya, et al.
Published: (2022)
Dedicated Inference Engine and Binary-Weight Neural Networks for Lightweight Instance Segmentation
by: Chen, Tse-Wei, et al.
Published: (2025)
Physically Grounded Monocular Depth via Nanophotonic Wavefront Prompting
by: Li, Bingxuan, et al.
Published: (2025)
Using GUI Agent for Electronic Design Automation
by: Li, Chunyi, et al.
Published: (2025)
Identifying Unnecessary 3D Gaussians using Clustering for Fast Rendering of 3D Gaussian Splatting
by: Jo, Joongho, et al.
Published: (2024)
ViM-Q: Scalable Algorithm-Hardware Co-Design for Vision Mamba Model Inference on FPGA
by: Lyu, Shengzhe, et al.
Published: (2026)
Gen-NeRF: Efficient and Generalizable Neural Radiance Fields via Algorithm-Hardware Co-Design
by: Fu, Yonggan, et al.
Published: (2023)
Ternary-Input Binary-Weight CNN Accelerator Design for Miniature Object Classification System with Query-Driven Spatial DVS
by: Li, Yuyang, et al.
Published: (2025)
Neo: Real-Time On-Device 3D Gaussian Splatting with Reuse-and-Update Sorting Acceleration
by: Oh, Changhun, et al.
Published: (2025)
Co-designing a Sub-millisecond Latency Event-based Eye Tracking System with Submanifold Sparse CNN
by: Zhang, Baoheng, et al.
Published: (2024)
GS-TG: 3D Gaussian Splatting Accelerator with Tile Grouping for Reducing Redundant Sorting while Preserving Rasterization Efficiency
by: Jo, Joongho, et al.
Published: (2025)
BILLNET: A Binarized Conv3D-LSTM Network with Logic-gated residual architecture for hardware-efficient video inference
by: Nguyen, Van Thien, et al.
Published: (2025)
ViTCoD: Vision Transformer Acceleration via Dedicated Algorithm and Accelerator Co-Design
by: You, Haoran, et al.
Published: (2022)
CHOSEN: Compilation to Hardware Optimization Stack for Efficient Vision Transformer Inference
by: Sadeghi, Mohammad Erfan, et al.
Published: (2024)
GRTX: Efficient Ray Tracing for 3D Gaussian-Based Rendering
by: Lee, Junseo, et al.
Published: (2026)
TinyIceNet: Low-Power SAR Sea Ice Segmentation for On-Board FPGA Inference
by: Koutayni, Mhd Rashed Al, et al.
Published: (2026)
NeuralFuse: Learning to Recover the Accuracy of Access-Limited Neural Network Inference in Low-Voltage Regimes
by: Sun, Hao-Lun, et al.
Published: (2023)
AppSign: Multi-level Approximate Computing for Real-Time Traffic Sign Recognition in Autonomous Vehicles
by: Omidian, Fatemeh, et al.
Published: (2024)
Evolving Layer-Specific Scalar Functions for Hardware-Aware Transformer Adaptation
by: Carrigg, Kieran, et al.
Published: (2026)
ORBIS: Output-Guided Token Reduction with Distribution-Aware Matching for Video Diffusion Acceleration
by: Lee, Hangyeol, et al.
Published: (2026)
Primitive-Driven Acceleration of Hyperdimensional Computing for Real-Time Image Classification
by: Parikh, Dhruv, et al.
Published: (2026)
Vision Transformers on the Edge: A Comprehensive Survey of Model Compression and Acceleration Strategies
by: Saha, Shaibal, et al.
Published: (2025)
TIMERIPPLE: Accelerating vDiTs by Understanding the Spatio-Temporal Correlations in Latent Space
by: Miao, Wenxuan, et al.
Published: (2025)
MVQ: Towards Efficient DNN Compression and Acceleration with Masked Vector Quantization
by: Li, Shuaiting, et al.
Published: (2024)
Real-Time Object Detection and Classification using YOLO for Edge FPGAs
by: Amin, Rashed Al, et al.
Published: (2025)
FG-Attn: Leveraging Fine-Grained Sparsity In Diffusion Transformers
by: Durvasula, Sankeerth, et al.
Published: (2025)
A Parameterizable Convolution Accelerator for Embedded Deep Learning Applications
by: Mousouliotis, Panagiotis, et al.
Published: (2026)
Efficient stereo matching on embedded GPUs with zero-means cross correlation
by: Chang, Qiong, et al.
Published: (2022)
SpNeRF: Memory Efficient Sparse Volumetric Neural Rendering Accelerator for Edge Devices
by: Zhang, Yipu, et al.
Published: (2025)
LoRA-Edge: Tensor-Train-Assisted LoRA for Practical CNN Fine-Tuning on Edge Devices
by: Kwak, Hyunseok, et al.
Published: (2025)
SF-MMCN: Low-Power Sever Flow Multi-Mode Diffusion Model Accelerator
by: Hsu, Huan-Ke, et al.
Published: (2024)
hARMS: A Hardware Acceleration Architecture for Real-Time Event-Based Optical Flow
by: Stumpp, Daniel C., et al.
Published: (2021)
On-Orbit Real-Time Wildfire Detection Under On-Board Constraints
by: Rötzer, Matthias, et al.
Published: (2026)
CAMO: Correlation-Aware Mask Optimization with Modulated Reinforcement Learning
by: Liang, Xiaoxiao, et al.
Published: (2024)
Accelerating AI and Computer Vision for Satellite Pose Estimation on the Intel Myriad X Embedded SoC
by: Leon, Vasileios, et al.
Published: (2024)
Smaller, Faster, Cheaper: Architectural Designs for Efficient Machine Learning
by: Walton, Steven
Published: (2025)
QUILL: An Algorithm-Architecture Co-Design for Cache-Local Deformable Attention
by: Oh, Hyunwoo, et al.
Published: (2025)
HARFLOW3D: A Latency-Oriented 3D-CNN Accelerator Toolflow for HAR on FPGA Devices
by: Toupas, Petros, et al.
Published: (2023)
ASC: Adaptive Scale Feature Map Compression for Deep Neural Network
by: Yao, Yuan, et al.
Published: (2023)