| Main Authors: | Li, Chengjian, Shu, Xiangbo, Cui, Qiongjie, Yao, Yazhou, Tang, Jinhui |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2411.17532 |
Similar Items
Spatiotemporal-Untrammelled Mixture of Experts for Multi-Person Motion Prediction
by: Yin, Zheng, et al.
Published: (2025)
OmniGaze: Reward-inspired Generalizable Gaze Estimation In The Wild
by: Qu, Hongyu, et al.
Published: (2025)
Multimodal Sense-Informed Prediction of 3D Human Motions
by: Lou, Zhenyu, et al.
Published: (2024)
Plenodium: UnderWater 3D Scene Reconstruction with Plenoptic Medium Representation
by: Wu, Changguanng, et al.
Published: (2025)
Vision-centric Token Compression in Large Language Model
by: Xing, Ling, et al.
Published: (2025)
Locality-aware Cross-modal Correspondence Learning for Dense Audio-Visual Events Localization
by: Xing, Ling, et al.
Published: (2024)
Seeing What Matters: Empowering CLIP with Patch Generation-to-Selection
by: Pei, Gensheng, et al.
Published: (2025)
Bilingual Text-to-Motion Generation: A New Benchmark and Baselines
by: Weng, Wanjiang, et al.
Published: (2026)
Expressive Forecasting of 3D Whole-body Human Motions
by: Ding, Pengxiang, et al.
Published: (2023)
Taming SAM3 in the Wild: A Concept Bank for Open-Vocabulary Segmentation
by: Pei, Gensheng, et al.
Published: (2026)
Learning 3D Representations for Spatial Intelligence from Unposed Multi-View Images
by: Zhou, Bo, et al.
Published: (2026)
EventCrab: Harnessing Frame and Point Synergy for Event-based Action Recognition and Beyond
by: Cao, Meiqi, et al.
Published: (2024)
SpectMamba: Integrating Frequency and State Space Models for Enhanced Medical Image Detection
by: Wang, Yao, et al.
Published: (2025)
AdaFPP: Adapt-Focused Bi-Propagating Prototype Learning for Panoramic Activity Recognition
by: Cao, Meiqi, et al.
Published: (2024)
MambaVSR: Content-Aware Scanning State Space Model for Video Super-Resolution
by: He, Linfeng, et al.
Published: (2025)
PGP-DiffSR: Phase-Guided Progressive Pruning for Efficient Diffusion-based Image Super-Resolution
by: Yang, Zhongbao, et al.
Published: (2025)
Spatio-temporal Decoupled Knowledge Compensator for Few-Shot Action Recognition
by: Qu, Hongyu, et al.
Published: (2026)
Beyond Quadratic: Linear-Time Change Detection with RWKV
by: Yang, Zhenyu, et al.
Published: (2026)
PCA-Seg: Revisiting Cost Aggregation for Open-Vocabulary Semantic and Part Segmentation
by: Yin, Jianjian, et al.
Published: (2026)
Spatial Structure Constraints for Weakly Supervised Semantic Segmentation
by: Chen, Tao, et al.
Published: (2024)
Combating Noisy Labels through Fostering Self- and Neighbor-Consistency
by: Sun, Zeren, et al.
Published: (2026)
Diff-MM: Exploring Pre-trained Text-to-Image Generation Model for Unified Multi-modal Object Tracking
by: Xuan, Shiyu, et al.
Published: (2025)
MambaMOT: State-Space Model as Motion Predictor for Multi-Object Tracking
by: Huang, Hsiang-Wei, et al.
Published: (2024)
Motion Mamba: Efficient and Long Sequence Motion Generation
by: Zhang, Zeyu, et al.
Published: (2024)
ASTRA: Let Arbitrary Subjects Transform in Video Editing
by: Shen, Fei, et al.
Published: (2025)
Dynamic in Static: Hybrid Visual Correspondence for Self-Supervised Video Object Segmentation
by: Pei, Gensheng, et al.
Published: (2024)
MambaVF: State Space Model for Efficient Video Fusion
by: Zhao, Zixiang, et al.
Published: (2026)
MambaIR: A Simple Baseline for Image Restoration with State-Space Model
by: Guo, Hang, et al.
Published: (2024)
Mamba-Adaptor: State Space Model Adaptor for Visual Recognition
by: Xie, Fei, et al.
Published: (2025)
FedMLLM: Federated Fine-tuning MLLM on Multimodal Heterogeneity Data
by: Xu, Binqian, et al.
Published: (2024)
Efficient Visual State Space Model for Image Deblurring
by: Kong, Lingshun, et al.
Published: (2024)
PhysMamba: State Space Duality Model for Remote Physiological Measurement
by: Yan, Zhixin, et al.
Published: (2024)
COMOGen: A Controllable Text-to-3D Multi-object Generation Framework
by: Sun, Shaorong, et al.
Published: (2024)
SF-Mamba: Rethinking State Space Model for Vision
by: Yoshimura, Masakazu, et al.
Published: (2026)
DeRainMamba: A Frequency-Aware State Space Model with Detail Enhancement for Image Deraining
by: Zhu, Zhiliang, et al.
Published: (2025)
Mamba-based Spatio-Frequency Motion Perception for Video Camouflaged Object Detection
by: Li, Xin, et al.
Published: (2025)
VideoMamba: State Space Model for Efficient Video Understanding
by: Li, Kunchang, et al.
Published: (2024)
Text-controlled Motion Mamba: Text-Instructed Temporal Grounding of Human Motion
by: Wang, Xinghan, et al.
Published: (2024)
KMM: Key Frame Mask Mamba for Extended Motion Generation
by: Zhang, Zeyu, et al.
Published: (2024)
T2M Mamba: Motion Periodicity-Saliency Coupling Approach for Stable Text-Driven Motion Generation
by: Zhan, Xingzu, et al.
Published: (2026)