| Main Authors: | Huang, Zhuoxu, Fan, Zhenkun, Han, Jungong, Kittler, Josef |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2606.01604 |
Similar Items
On Exploring PDE Modeling for Point Cloud Video Representation Learning
by: Huang, Zhuoxu, et al.
Published: (2024)
Point Linguist Model: Segment Any Object via Bridged Large 3D-Language Model
by: Huang, Zhuoxu, et al.
Published: (2025)
Pixel Sentence Representation Learning
by: Xiao, Chenghao, et al.
Published: (2024)
MERGETUNE: Continued Fine-Tuning of Vision-Language Models
by: Wang, Wenqing, et al.
Published: (2026)
Single Image, Any Face: Generalisable 3D Face Generation
by: Wang, Wenqing, et al.
Published: (2024)
Dynamic Avatar-Scene Rendering from Human-centric Context
by: Wang, Wenqing, et al.
Published: (2025)
SAM-Body4D: Training-Free 4D Human Body Mesh Recovery from Videos
by: Gao, Mingqi, et al.
Published: (2025)
Virtual Category Learning: A Semi-Supervised Learning Method for Dense Prediction with Extremely Limited Labels
by: Chen, Changrui, et al.
Published: (2023)
Investigating Self-Supervised Methods for Label-Efficient Learning
by: Nandam, Srinivasa Rao, et al.
Published: (2024)
SimMLM: A Simple Framework for Multi-modal Learning with Missing Modality
by: Li, Sijie, et al.
Published: (2025)
Representation Learning for Point Cloud Understanding
by: Yan, Siming
Published: (2025)
PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm
by: Zhu, Haoyi, et al.
Published: (2023)
MMDRFuse: Distilled Mini-Model with Dynamic Refresh for Multi-Modality Image Fusion
by: Deng, Yanglin, et al.
Published: (2024)
An Improved Graph Pooling Network for Skeleton-Based Action Recognition
by: Wu, Cong, et al.
Published: (2024)
Physics-Driven Local-Whole Elastic Deformation Modeling for Point Cloud Representation Learning
by: Chen, Zhongyu, et al.
Published: (2025)
Intrinsic Image Decomposition Using Point Cloud Representation
by: Xing, Xiaoyan, et al.
Published: (2023)
THU-Warwick Submission for EPIC-KITCHEN Challenge 2025: Semi-Supervised Video Object Segmentation
by: Gao, Mingqi, et al.
Published: (2025)
Dynamic Subframe Splitting and Spatio-Temporal Motion Entangled Sparse Attention for RGB-E Tracking
by: Shao, Pengcheng, et al.
Published: (2024)
MAGIC-Talk: Motion-aware Audio-Driven Talking Face Generation with Customizable Identity Control
by: Nazarieh, Fatemeh, et al.
Published: (2025)
KAN or MLP? Point Cloud Shows the Way Forward
by: Shi, Yan, et al.
Published: (2025)
A Tri-Modal Dataset and a Baseline System for Tracking Unmanned Aerial Vehicles
by: Xu, Tianyang, et al.
Published: (2025)
Reinforcing 3D Understanding in Point-VLMs via Geometric Reward Credit Assignment
by: Chen, Jingkun, et al.
Published: (2026)
MF-MOS: A Motion-Focused Model for Moving Object Segmentation
by: Cheng, Jintao, et al.
Published: (2024)
Point2Vec for Self-Supervised Representation Learning on Point Clouds
by: Knaebel, Karim, et al.
Published: (2023)
Learning Progressive Adaptation for Multi-Modal Tracking
by: Wang, He, et al.
Published: (2026)
Multi-Paradigm Collaborative Adversarial Attack Against Multi-Modal Large Language Models
by: Li, Yuanbo, et al.
Published: (2026)
Towards Fusing Point Cloud and Visual Representations for Imitation Learning
by: Donat, Atalay, et al.
Published: (2025)
Novel Class Discovery for Point Cloud Segmentation via Joint Learning of Causal Representation and Reasoning
by: Li, Yang, et al.
Published: (2025)
On-the-fly Point Feature Representation for Point Clouds Analysis
by: Wang, Jiangyi, et al.
Published: (2024)
T-MAE: Temporal Masked Autoencoders for Point Cloud Representation Learning
by: Wei, Weijie, et al.
Published: (2023)
A Unified Framework for Human-centric Point Cloud Video Understanding
by: Xu, Yiteng, et al.
Published: (2024)
QuoVLA: Quotient Space for Vision-Language-Action Models
by: Wang, Xuan, et al.
Published: (2026)
Re-Prompting SAM 3 via Object Retrieval: 3rd of the 5th PVUW MOSE Track
by: Gao, Mingqi, et al.
Published: (2026)
WaveFace: Authentic Face Restoration with Efficient Frequency Recovery
by: Miao, Yunqi, et al.
Published: (2024)
1st Place Solution for MeViS Track in CVPR 2024 PVUW Workshop: Motion Expression guided Video Segmentation
by: Gao, Mingqi, et al.
Published: (2024)
Point Cloud Mamba: Point Cloud Learning via State Space Model
by: Zhang, Tao, et al.
Published: (2024)
Object Dynamics Modeling with Hierarchical Point Cloud-based Representations
by: Kim, Chanho, et al.
Published: (2024)
Modality Prompts for Arbitrary Modality Salient Object Detection
by: Huang, Nianchang, et al.
Published: (2024)
Editing Physiological Signals in Videos Using Latent Representations
by: Zhou, Tianwen, et al.
Published: (2025)
Advancements in Point Cloud Data Augmentation for Deep Learning: A Survey
by: Zhu, Qinfeng, et al.
Published: (2023)