Saved in:
Bibliographic Details
Main Authors: Krishna, Adithya, Debnath, Sohan, Srivatsav, Madhuvanthi, van Schaik, André, Mehendale, Mahesh, Thakur, Chetan Singh
Format: Preprint
Published: 2025
Subjects:
Online Access:https://arxiv.org/abs/2504.06996
Tags: Add Tag
No Tags, Be the first to tag this record!
_version_ 1866918133266120704
author Krishna, Adithya
Debnath, Sohan
Srivatsav, Madhuvanthi
van Schaik, André
Mehendale, Mahesh
Thakur, Chetan Singh
author_facet Krishna, Adithya
Debnath, Sohan
Srivatsav, Madhuvanthi
van Schaik, André
Mehendale, Mahesh
Thakur, Chetan Singh
contents High-quality, multi-channel neural recording is indispensable for neuroscience research and clinical applications. Large-scale brain recordings often produce vast amounts of data that must be wirelessly transmitted for subsequent offline analysis and decoding, especially in brain-computer interfaces (BCIs) utilizing high-density intracortical recordings with hundreds or thousands of electrodes. However, transmitting raw neural data presents significant challenges due to limited communication bandwidth and resultant excessive heating. To address this challenge, we propose a neural signal compression scheme utilizing Convolutional Autoencoders (CAEs), which achieves a compression ratio of up to 150 for compressing local field potentials (LFPs). The CAE encoder section is implemented on RAMAN, an energy-efficient tinyML accelerator designed for edge computing. RAMAN leverages sparsity in activation and weights through zero skipping, gating, and weight compression techniques. Additionally, we employ hardware-software co-optimization by pruning the CAE encoder model parameters using a hardware-aware balanced stochastic pruning strategy, resolving workload imbalance issues and eliminating indexing overhead to reduce parameter storage requirements by up to 32.4%. Post layout simulation shows that the RAMAN encoder can be implemented in a TSMC 65-nm CMOS process, occupying a core area of 0.0187 mm2 per channel. Operating at a clock frequency of 2 MHz and a supply voltage of 1.2 V, the estimated power consumption is 15.1 uW per channel for the proposed DS-CAE1 model. For functional validation, the RAMAN encoder was also deployed on an Efinix Ti60 FPGA, utilizing 37.3k LUTs and 8.6k flip-flops. The compressed neural data from RAMAN is reconstructed offline with SNDR of 22.6 dB and 27.4 dB, along with R2 scores of 0.81 and 0.94, respectively, evaluated on two monkey neural recordings.
format Preprint
id arxiv_https___arxiv_org_abs_2504_06996
institution arXiv
publishDate 2025
record_format arxiv
spellingShingle Neural Signal Compression using RAMAN tinyML Accelerator for BCI Applications
Krishna, Adithya
Debnath, Sohan
Srivatsav, Madhuvanthi
van Schaik, André
Mehendale, Mahesh
Thakur, Chetan Singh
Hardware Architecture
Human-Computer Interaction
Machine Learning
High-quality, multi-channel neural recording is indispensable for neuroscience research and clinical applications. Large-scale brain recordings often produce vast amounts of data that must be wirelessly transmitted for subsequent offline analysis and decoding, especially in brain-computer interfaces (BCIs) utilizing high-density intracortical recordings with hundreds or thousands of electrodes. However, transmitting raw neural data presents significant challenges due to limited communication bandwidth and resultant excessive heating. To address this challenge, we propose a neural signal compression scheme utilizing Convolutional Autoencoders (CAEs), which achieves a compression ratio of up to 150 for compressing local field potentials (LFPs). The CAE encoder section is implemented on RAMAN, an energy-efficient tinyML accelerator designed for edge computing. RAMAN leverages sparsity in activation and weights through zero skipping, gating, and weight compression techniques. Additionally, we employ hardware-software co-optimization by pruning the CAE encoder model parameters using a hardware-aware balanced stochastic pruning strategy, resolving workload imbalance issues and eliminating indexing overhead to reduce parameter storage requirements by up to 32.4%. Post layout simulation shows that the RAMAN encoder can be implemented in a TSMC 65-nm CMOS process, occupying a core area of 0.0187 mm2 per channel. Operating at a clock frequency of 2 MHz and a supply voltage of 1.2 V, the estimated power consumption is 15.1 uW per channel for the proposed DS-CAE1 model. For functional validation, the RAMAN encoder was also deployed on an Efinix Ti60 FPGA, utilizing 37.3k LUTs and 8.6k flip-flops. The compressed neural data from RAMAN is reconstructed offline with SNDR of 22.6 dB and 27.4 dB, along with R2 scores of 0.81 and 0.94, respectively, evaluated on two monkey neural recordings.
title Neural Signal Compression using RAMAN tinyML Accelerator for BCI Applications
topic Hardware Architecture
Human-Computer Interaction
Machine Learning
url https://arxiv.org/abs/2504.06996