:: Library Catalog

Bibliographic Details
Main Authors:	Li, Chengjian; Shu, Xiangbo; Cui, Qiongjie; Yao, Yazhou; Tang, Jinhui
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2411.17532

Similar Items

Spatiotemporal-Untrammelled Mixture of Experts for Multi-Person Motion Prediction
by: Yin, Zheng, et al.
Published: (2025)

OmniGaze: Reward-inspired Generalizable Gaze Estimation In The Wild
by: Qu, Hongyu, et al.
Published: (2025)

Multimodal Sense-Informed Prediction of 3D Human Motions
by: Lou, Zhenyu, et al.
Published: (2024)

Plenodium: UnderWater 3D Scene Reconstruction with Plenoptic Medium Representation
by: Wu, Changguanng, et al.
Published: (2025)

Vision-centric Token Compression in Large Language Model
by: Xing, Ling, et al.
Published: (2025)

Locality-aware Cross-modal Correspondence Learning for Dense Audio-Visual Events Localization
by: Xing, Ling, et al.
Published: (2024)

Seeing What Matters: Empowering CLIP with Patch Generation-to-Selection
by: Pei, Gensheng, et al.
Published: (2025)

Bilingual Text-to-Motion Generation: A New Benchmark and Baselines
by: Weng, Wanjiang, et al.
Published: (2026)

Expressive Forecasting of 3D Whole-body Human Motions
by: Ding, Pengxiang, et al.
Published: (2023)

Taming SAM3 in the Wild: A Concept Bank for Open-Vocabulary Segmentation
by: Pei, Gensheng, et al.
Published: (2026)

Learning 3D Representations for Spatial Intelligence from Unposed Multi-View Images
by: Zhou, Bo, et al.
Published: (2026)

EventCrab: Harnessing Frame and Point Synergy for Event-based Action Recognition and Beyond
by: Cao, Meiqi, et al.
Published: (2024)

SpectMamba: Integrating Frequency and State Space Models for Enhanced Medical Image Detection
by: Wang, Yao, et al.
Published: (2025)

AdaFPP: Adapt-Focused Bi-Propagating Prototype Learning for Panoramic Activity Recognition
by: Cao, Meiqi, et al.
Published: (2024)

MambaVSR: Content-Aware Scanning State Space Model for Video Super-Resolution
by: He, Linfeng, et al.
Published: (2025)

PGP-DiffSR: Phase-Guided Progressive Pruning for Efficient Diffusion-based Image Super-Resolution
by: Yang, Zhongbao, et al.
Published: (2025)

Spatio-temporal Decoupled Knowledge Compensator for Few-Shot Action Recognition
by: Qu, Hongyu, et al.
Published: (2026)

Beyond Quadratic: Linear-Time Change Detection with RWKV
by: Yang, Zhenyu, et al.
Published: (2026)

PCA-Seg: Revisiting Cost Aggregation for Open-Vocabulary Semantic and Part Segmentation
by: Yin, Jianjian, et al.
Published: (2026)

Spatial Structure Constraints for Weakly Supervised Semantic Segmentation
by: Chen, Tao, et al.
Published: (2024)

Combating Noisy Labels through Fostering Self- and Neighbor-Consistency
by: Sun, Zeren, et al.
Published: (2026)

Diff-MM: Exploring Pre-trained Text-to-Image Generation Model for Unified Multi-modal Object Tracking
by: Xuan, Shiyu, et al.
Published: (2025)

MambaMOT: State-Space Model as Motion Predictor for Multi-Object Tracking
by: Huang, Hsiang-Wei, et al.
Published: (2024)

Motion Mamba: Efficient and Long Sequence Motion Generation
by: Zhang, Zeyu, et al.
Published: (2024)

ASTRA: Let Arbitrary Subjects Transform in Video Editing
by: Shen, Fei, et al.
Published: (2025)

Dynamic in Static: Hybrid Visual Correspondence for Self-Supervised Video Object Segmentation
by: Pei, Gensheng, et al.
Published: (2024)

MambaVF: State Space Model for Efficient Video Fusion
by: Zhao, Zixiang, et al.
Published: (2026)

MambaIR: A Simple Baseline for Image Restoration with State-Space Model
by: Guo, Hang, et al.
Published: (2024)

Mamba-Adaptor: State Space Model Adaptor for Visual Recognition
by: Xie, Fei, et al.
Published: (2025)

FedMLLM: Federated Fine-tuning MLLM on Multimodal Heterogeneity Data
by: Xu, Binqian, et al.
Published: (2024)

Efficient Visual State Space Model for Image Deblurring
by: Kong, Lingshun, et al.
Published: (2024)

PhysMamba: State Space Duality Model for Remote Physiological Measurement
by: Yan, Zhixin, et al.
Published: (2024)

COMOGen: A Controllable Text-to-3D Multi-object Generation Framework
by: Sun, Shaorong, et al.
Published: (2024)

SF-Mamba: Rethinking State Space Model for Vision
by: Yoshimura, Masakazu, et al.
Published: (2026)

DeRainMamba: A Frequency-Aware State Space Model with Detail Enhancement for Image Deraining
by: Zhu, Zhiliang, et al.
Published: (2025)

Mamba-based Spatio-Frequency Motion Perception for Video Camouflaged Object Detection
by: Li, Xin, et al.
Published: (2025)

VideoMamba: State Space Model for Efficient Video Understanding
by: Li, Kunchang, et al.
Published: (2024)

Text-controlled Motion Mamba: Text-Instructed Temporal Grounding of Human Motion
by: Wang, Xinghan, et al.
Published: (2024)

KMM: Key Frame Mask Mamba for Extended Motion Generation
by: Zhang, Zeyu, et al.
Published: (2024)

T2M Mamba: Motion Periodicity-Saliency Coupling Approach for Stable Text-Driven Motion Generation
by: Zhan, Xingzu, et al.
Published: (2026)