Library Catalog

Bibliographic Details
Main Authors:	Huang, Zhuoxu; Fan, Zhenkun; Han, Jungong; Kittler, Josef
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2606.01604

Similar Items

On Exploring PDE Modeling for Point Cloud Video Representation Learning
by: Huang, Zhuoxu, et al.
Published: (2024)

Point Linguist Model: Segment Any Object via Bridged Large 3D-Language Model
by: Huang, Zhuoxu, et al.
Published: (2025)

Pixel Sentence Representation Learning
by: Xiao, Chenghao, et al.
Published: (2024)

MERGETUNE: Continued Fine-Tuning of Vision-Language Models
by: Wang, Wenqing, et al.
Published: (2026)

Single Image, Any Face: Generalisable 3D Face Generation
by: Wang, Wenqing, et al.
Published: (2024)

Dynamic Avatar-Scene Rendering from Human-centric Context
by: Wang, Wenqing, et al.
Published: (2025)

SAM-Body4D: Training-Free 4D Human Body Mesh Recovery from Videos
by: Gao, Mingqi, et al.
Published: (2025)

Virtual Category Learning: A Semi-Supervised Learning Method for Dense Prediction with Extremely Limited Labels
by: Chen, Changrui, et al.
Published: (2023)

Investigating Self-Supervised Methods for Label-Efficient Learning
by: Nandam, Srinivasa Rao, et al.
Published: (2024)

SimMLM: A Simple Framework for Multi-modal Learning with Missing Modality
by: Li, Sijie, et al.
Published: (2025)

Representation Learning for Point Cloud Understanding
by: Yan, Siming
Published: (2025)

PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm
by: Zhu, Haoyi, et al.
Published: (2023)

MMDRFuse: Distilled Mini-Model with Dynamic Refresh for Multi-Modality Image Fusion
by: Deng, Yanglin, et al.
Published: (2024)

An Improved Graph Pooling Network for Skeleton-Based Action Recognition
by: Wu, Cong, et al.
Published: (2024)

Physics-Driven Local-Whole Elastic Deformation Modeling for Point Cloud Representation Learning
by: Chen, Zhongyu, et al.
Published: (2025)

Intrinsic Image Decomposition Using Point Cloud Representation
by: Xing, Xiaoyan, et al.
Published: (2023)

THU-Warwick Submission for EPIC-KITCHEN Challenge 2025: Semi-Supervised Video Object Segmentation
by: Gao, Mingqi, et al.
Published: (2025)

Dynamic Subframe Splitting and Spatio-Temporal Motion Entangled Sparse Attention for RGB-E Tracking
by: Shao, Pengcheng, et al.
Published: (2024)

MAGIC-Talk: Motion-aware Audio-Driven Talking Face Generation with Customizable Identity Control
by: Nazarieh, Fatemeh, et al.
Published: (2025)

KAN or MLP? Point Cloud Shows the Way Forward
by: Shi, Yan, et al.
Published: (2025)

A Tri-Modal Dataset and a Baseline System for Tracking Unmanned Aerial Vehicles
by: Xu, Tianyang, et al.
Published: (2025)

Reinforcing 3D Understanding in Point-VLMs via Geometric Reward Credit Assignment
by: Chen, Jingkun, et al.
Published: (2026)

MF-MOS: A Motion-Focused Model for Moving Object Segmentation
by: Cheng, Jintao, et al.
Published: (2024)

Point2Vec for Self-Supervised Representation Learning on Point Clouds
by: Knaebel, Karim, et al.
Published: (2023)

Learning Progressive Adaptation for Multi-Modal Tracking
by: Wang, He, et al.
Published: (2026)

Multi-Paradigm Collaborative Adversarial Attack Against Multi-Modal Large Language Models
by: Li, Yuanbo, et al.
Published: (2026)

Towards Fusing Point Cloud and Visual Representations for Imitation Learning
by: Donat, Atalay, et al.
Published: (2025)

Novel Class Discovery for Point Cloud Segmentation via Joint Learning of Causal Representation and Reasoning
by: Li, Yang, et al.
Published: (2025)

On-the-fly Point Feature Representation for Point Clouds Analysis
by: Wang, Jiangyi, et al.
Published: (2024)

T-MAE: Temporal Masked Autoencoders for Point Cloud Representation Learning
by: Wei, Weijie, et al.
Published: (2023)

A Unified Framework for Human-centric Point Cloud Video Understanding
by: Xu, Yiteng, et al.
Published: (2024)

QuoVLA: Quotient Space for Vision-Language-Action Models
by: Wang, Xuan, et al.
Published: (2026)

Re-Prompting SAM 3 via Object Retrieval: 3rd of the 5th PVUW MOSE Track
by: Gao, Mingqi, et al.
Published: (2026)

WaveFace: Authentic Face Restoration with Efficient Frequency Recovery
by: Miao, Yunqi, et al.
Published: (2024)

1st Place Solution for MeViS Track in CVPR 2024 PVUW Workshop: Motion Expression guided Video Segmentation
by: Gao, Mingqi, et al.
Published: (2024)

Point Cloud Mamba: Point Cloud Learning via State Space Model
by: Zhang, Tao, et al.
Published: (2024)

Object Dynamics Modeling with Hierarchical Point Cloud-based Representations
by: Kim, Chanho, et al.
Published: (2024)

Modality Prompts for Arbitrary Modality Salient Object Detection
by: Huang, Nianchang, et al.
Published: (2024)

Editing Physiological Signals in Videos Using Latent Representations
by: Zhou, Tianwen, et al.
Published: (2025)

Advancements in Point Cloud Data Augmentation for Deep Learning: A Survey
by: Zhu, Qinfeng, et al.
Published: (2023)