Bibliographic Details
Main Authors: Ge, Chengjie; Fu, Xueyang; He, Peng; Wang, Kunyu; Cao, Chengzhi; Zha, Zheng-Jun
Format: Preprint
Published: 2025
Subjects: Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2503.19721
_version_ 1866909560225136640
author Ge, Chengjie
Fu, Xueyang
He, Peng
Wang, Kunyu
Cao, Chengzhi
Zha, Zheng-Jun
contents Leveraging its robust linear global modeling capability, Mamba has notably excelled in computer vision. Despite its success, existing Mamba-based vision models have overlooked the nuances of event-driven tasks, especially in video reconstruction. Event-based video reconstruction (EBVR) demands spatial translation invariance and close attention to local event relationships in the spatio-temporal domain. Unfortunately, conventional Mamba algorithms apply static window partitions and standard reshape scanning methods, leading to significant losses in local connectivity. To overcome these limitations, we introduce EventMamba--a specialized model designed for EBVR tasks. EventMamba innovates by incorporating random window offset (RWO) in the spatial domain, moving away from the restrictive fixed partitioning. Additionally, it features a new consistent traversal serialization approach in the spatio-temporal domain, which maintains the proximity of adjacent events both spatially and temporally. These enhancements enable EventMamba to retain Mamba's robust modeling capabilities while significantly preserving the spatio-temporal locality of event data. Comprehensive testing on multiple datasets shows that EventMamba markedly enhances video reconstruction, drastically improving computation speed while delivering superior visual quality compared to Transformer-based methods.
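The abstract names two ideas: a random window offset in the spatial domain, and a serialization order that keeps adjacent events close together in the sequence. The following is a minimal pure-Python sketch of the general intuition only, not the authors' implementation. The serpentine scan stands in for the paper's "consistent traversal" (whose exact definition is not given in this record), and every function name, the grid size, and the offset sampling rule are assumptions made for illustration.

```python
import random


def raster_order(height, width):
    """Row-major scan: the order a plain reshape of an H x W grid produces."""
    return [(r, c) for r in range(height) for c in range(width)]


def serpentine_order(height, width):
    """Boustrophedon scan: every other row is reversed so consecutive cells touch.

    Assumption: used here only as a simple example of a locality-preserving scan.
    """
    order = []
    for r in range(height):
        cols = range(width) if r % 2 == 0 else range(width - 1, -1, -1)
        order.extend((r, c) for c in cols)
    return order


def max_step(order):
    """Largest Manhattan distance between consecutive cells in a scan order."""
    return max(
        abs(r1 - r0) + abs(c1 - c0)
        for (r0, c0), (r1, c1) in zip(order, order[1:])
    )


def window_index(row, col, window, offset):
    """Window id of a cell after shifting the partition grid by `offset`."""
    dy, dx = offset
    return ((row + dy) // window, (col + dx) // window)


def random_offset(window, rng):
    """Draw a hypothetical offset in [0, window) for each spatial axis."""
    return (rng.randrange(window), rng.randrange(window))


if __name__ == "__main__":
    print("raster max step:", max_step(raster_order(4, 4)))
    print("serpentine max step:", max_step(serpentine_order(4, 4)))
    offset = random_offset(4, random.Random(0))
    print("offset:", offset, "window of (3, 0):", window_index(3, 0, 4, offset))
```

On a 4 x 4 grid the row-major scan jumps a Manhattan distance of 4 at each row end, while the serpentine scan never moves more than one cell. Likewise, rows 3 and 4 fall in different windows under a fixed partition of size 4 but share a window when the grid is shifted by one row, which is the kind of boundary effect a randomized offset is meant to average out.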
format Preprint
id arxiv_https___arxiv_org_abs_2503_19721
institution arXiv
publishDate 2025
record_format arxiv
title EventMamba: Enhancing Spatio-Temporal Locality with State Space Models for Event-Based Video Reconstruction
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2503.19721