Staff View: Resource-Efficient RGB-Only Action Recognition for Edge Deployment :: Library Catalog

Bibliographic Details
Main Authors:	Yoon, Dongsik; Kim, Jongeun; Lee, Dayeon
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition; Performance
Online Access:	https://arxiv.org/abs/2602.10818

_version_	1866910019005448192
author	Yoon, Dongsik; Kim, Jongeun; Lee, Dayeon
author_facet	Yoon, Dongsik; Kim, Jongeun; Lee, Dayeon
contents	Action recognition on edge devices poses stringent constraints on latency, memory, storage, and power consumption. While auxiliary modalities such as skeleton and depth information can enhance recognition performance, they often require additional sensors or computationally expensive pose-estimation pipelines, limiting practicality for edge use. In this work, we propose a compact RGB-only network tailored for efficient on-device inference. Our approach builds upon an X3D-style backbone augmented with Temporal Shift, and further introduces selective temporal adaptation and parameter-free attention. Extensive experiments on the NTU RGB+D 60 and 120 benchmarks demonstrate a strong accuracy-efficiency balance. Moreover, deployment-level profiling on the Jetson Orin Nano verifies a smaller on-device footprint and practical resource utilization compared to existing RGB-based action recognition techniques.
format	Preprint
id	arxiv_https___arxiv_org_abs_2602_10818
institution	arXiv
publishDate	2026
record_format	arxiv
title	Resource-Efficient RGB-Only Action Recognition for Edge Deployment
topic	Computer Vision and Pattern Recognition; Performance
url	https://arxiv.org/abs/2602.10818