:: Library Catalog

Bibliographic Details
Main Authors:	Zhan, Yuliang; Li, Jian; Huang, Wenbing; Liu, Yang; Sun, Hao
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2602.01844

Similar Items

CloSe: A 3D Clothing Segmentation Dataset and Model
by: Antić, Dimitrije, et al.
Published: (2024)

Shedding Light on VLN Robustness: A Black-box Framework for Indoor Lighting-based Adversarial Attack
by: Li, Chenyang, et al.
Published: (2025)

Recurrent Reasoning with Vision-Language Models for Estimating Long-Horizon Embodied Task Progress
by: Zhang, Yuelin, et al.
Published: (2026)

Cross-View Referring Multi-Object Tracking
by: Chen, Sijia, et al.
Published: (2024)

PG-NeuS: Robust and Efficient Point Guidance for Multi-View Neural Surface Reconstruction
by: Zhang, Chen, et al.
Published: (2023)

DRMOT: A Dataset and Framework for RGBD Referring Multi-Object Tracking
by: Chen, Sijia, et al.
Published: (2026)

Accountable Textual-Visual Chat Learns to Reject Human Instructions in Image Re-creation
by: Zhang, Zhiwei, et al.
Published: (2023)

CHRIS: Clothed Human Reconstruction with Side View Consistency
by: Liu, Dong, et al.
Published: (2025)

CloSET: Modeling Clothed Humans on Continuous Surface with Explicit Template Decomposition
by: Zhang, Hongwen, et al.
Published: (2023)

Work Zones challenge VLM Trajectory Planning: Toward Mitigation and Robust Autonomous Driving
by: Liao, Yifan, et al.
Published: (2025)

ETCH: Generalizing Body Fitting to Clothed Humans via Equivariant Tightness
by: Li, Boqian, et al.
Published: (2025)

CluMo: Cluster-based Modality Fusion Prompt for Continual Learning in Visual Question Answering
by: Cai, Yuliang, et al.
Published: (2024)

Visual Self-paced Iterative Learning for Unsupervised Temporal Action Localization
by: Hu, Yupeng, et al.
Published: (2023)

BrainMem: Brain-Inspired Evolving Memory for Embodied Agent Task Planning
by: Ma, Xiaoyu, et al.
Published: (2026)

DGFamba: Learning Flow Factorized State Space for Visual Domain Generalization
by: Bi, Qi, et al.
Published: (2025)

Saliency-Guided Representation with Consistency Policy Learning for Visual Unsupervised Reinforcement Learning
by: Sun, Jingbo, et al.
Published: (2026)

ClothHMR: 3D Mesh Recovery of Humans in Diverse Clothing from Single Image
by: Gao, Yunqi, et al.
Published: (2025)

Poivre: Self-Refining Visual Pointing with Reinforcement Learning
by: Yang, Wenjie, et al.
Published: (2025)

Cluster Contrast for Unsupervised Visual Representation Learning
by: Giakoumoglou, Nikolaos, et al.
Published: (2025)

SeaBird: Segmentation in Bird's View with Dice Loss Improves Monocular 3D Detection of Large Objects
by: Kumar, Abhinav, et al.
Published: (2024)

Pretrained Reversible Generation as Unsupervised Visual Representation Learning
by: Xue, Rongkun, et al.
Published: (2024)

SCULPT: Shape-Conditioned Unpaired Learning of Pose-dependent Clothed and Textured Human Meshes
by: Sanyal, Soubhik, et al.
Published: (2023)

CloSE: A Geometric Shape-Agnostic Cloth State Representation
by: Kamat, Jay, et al.
Published: (2025)

Unsupervised Audio-Visual Segmentation with Modality Alignment
by: Bhosale, Swapnil, et al.
Published: (2024)

Physically Realistic Sequence-Level Adversarial Clothing for Robust Human-Detection Evasion
by: Zhou, Dingkun, et al.
Published: (2025)

Unsupervised Point Cloud Pre-Training via Contrasting and Clustering
by: Mei, Guofeng, et al.
Published: (2022)

ClothPPO: A Proximal Policy Optimization Enhancing Framework for Robotic Cloth Manipulation with Observation-Aligned Action Spaces
by: Yang, Libing, et al.
Published: (2024)

Source-Free and Image-Only Unsupervised Domain Adaptation for Category Level Object Pose Estimation
by: Kaushik, Prakhar, et al.
Published: (2024)

AvatarShield: Visual Reinforcement Learning for Human-Centric Synthetic Video Detection
by: Xu, Zhipei, et al.
Published: (2025)

Leveraging Unsupervised Learning for Cost-Effective Visual Anomaly Detection
by: Long, Yunbo, et al.
Published: (2024)

Unsupervised Anomaly Detection in Brain MRI via Disentangled Anatomy Learning
by: Yang, Tao, et al.
Published: (2025)

OCRBench v2: An Improved Benchmark for Evaluating Large Multimodal Models on Visual Text Localization and Reasoning
by: Fu, Ling, et al.
Published: (2024)

VimTS: A Unified Video and Image Text Spotter for Enhancing the Cross-domain Generalization
by: Liu, Yuliang, et al.
Published: (2024)

Rethinking Alignment and Uniformity in Unsupervised Semantic Segmentation
by: Zhang, Daoan, et al.
Published: (2022)

OpenGround: Active Cognition-based Reasoning for Open-World 3D Visual Grounding
by: Huang, Wenyuan, et al.
Published: (2025)

Hierarchical Semantic Correlation-Aware Masked Autoencoder for Unsupervised Audio-Visual Representation Learning
by: Zeng, Donghuo, et al.
Published: (2026)

Semantic Is Enough: Only Semantic Information For NeRF Reconstruction
by: Wang, Ruibo, et al.
Published: (2024)

DS$^2$Net: Detail-Semantic Deep Supervision Network for Medical Image Segmentation
by: Huang, Zhaohong, et al.
Published: (2025)

Reasoning-Enhanced Object-Centric Learning for Videos
by: Li, Jian, et al.
Published: (2024)

Online Language Splatting
by: Katragadda, Saimouli, et al.
Published: (2025)