Staff View :: Library Catalog

Bibliographic Details
Main Authors:	Ge, Chengjie; Fu, Xueyang; He, Peng; Wang, Kunyu; Cao, Chengzhi; Zha, Zheng-Jun
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2503.19721

_version_	1866909560225136640
author	Ge, Chengjie; Fu, Xueyang; He, Peng; Wang, Kunyu; Cao, Chengzhi; Zha, Zheng-Jun
author_facet	Ge, Chengjie; Fu, Xueyang; He, Peng; Wang, Kunyu; Cao, Chengzhi; Zha, Zheng-Jun
contents	Leveraging its robust linear global modeling capability, Mamba has notably excelled in computer vision. Despite its success, existing Mamba-based vision models have overlooked the nuances of event-driven tasks, especially in video reconstruction. Event-based video reconstruction (EBVR) demands spatial translation invariance and close attention to local event relationships in the spatio-temporal domain. Unfortunately, conventional Mamba algorithms apply static window partitions and standard reshape scanning methods, leading to significant losses in local connectivity. To overcome these limitations, we introduce EventMamba--a specialized model designed for EBVR tasks. EventMamba innovates by incorporating random window offset (RWO) in the spatial domain, moving away from the restrictive fixed partitioning. Additionally, it features a new consistent traversal serialization approach in the spatio-temporal domain, which maintains the proximity of adjacent events both spatially and temporally. These enhancements enable EventMamba to retain Mamba's robust modeling capabilities while significantly preserving the spatio-temporal locality of event data. Comprehensive testing on multiple datasets shows that EventMamba markedly enhances video reconstruction, drastically improving computation speed while delivering superior visual quality compared to Transformer-based methods.
format	Preprint
id	arxiv_https___arxiv_org_abs_2503_19721
institution	arXiv
publishDate	2025
record_format	arxiv
title	EventMamba: Enhancing Spatio-Temporal Locality with State Space Models for Event-Based Video Reconstruction
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2503.19721
