Library Catalog

Bibliographic Details
Main Authors:	Huangfu, Yuanxiang; Wang, Chaochao; Wang, Weilei
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2511.05057

Similar Items

SynthCLIP: Are We Ready for a Fully Synthetic CLIP Training?
by: Hammoud, Hasan Abed Al Kader, et al.
Published: (2024)

SynthVerse: A Large-Scale Diverse Synthetic Dataset for Point Tracking
by: Zhao, Weiguang, et al.
Published: (2026)

Language Plays a Pivotal Role in the Object-Attribute Compositional Generalization of CLIP
by: Abbasi, Reza, et al.
Published: (2024)

Synth-Align: Improving Trustworthiness in Vision-Language Model with Synthetic Preference Data Alignment
by: Wijaya, Robert, et al.
Published: (2024)

Synth It Like KITTI: Synthetic Data Generation for Object Detection in Driving Scenarios
by: Marcus, Richard, et al.
Published: (2025)

SRL-CLIP: Efficient CLIP Video Adaptation via Structured Semantic Role Labels
by: Singh, Darshan, et al.
Published: (2024)

SoccerSynth-Detection: A Synthetic Dataset for Soccer Player Detection
by: Qin, Haobin, et al.
Published: (2025)

Scene Graph Generation with Role-Playing Large Language Models
by: Chen, Guikun, et al.
Published: (2024)

SynPlay: Large-Scale Synthetic Human Data with Real-World Diversity for Aerial-View Perception
by: Yim, Jinsub, et al.
Published: (2024)

Know "No" Better: A Data-Driven Approach for Enhancing Negation Awareness in CLIP
by: Park, Junsung, et al.
Published: (2025)

AnySynth: Harnessing the Power of Image Synthetic Data Generation for Generalized Vision-Language Tasks
by: Li, You, et al.
Published: (2024)

CLIPS: An Enhanced CLIP Framework for Learning with Synthetic Captions
by: Liu, Yanqing, et al.
Published: (2024)

SynthPID: P&ID digitization from Topology-Preserving Synthetic Data
by: Prasad, Suraj, et al.
Published: (2026)

AmodalSynthDrive: A Synthetic Amodal Perception Dataset for Autonomous Driving
by: Sekkat, Ahmed Rida, et al.
Published: (2023)

SynthForensics: Benchmarking and Evaluating People-Centric Synthetic Video Deepfakes
by: Leotta, Roberto, et al.
Published: (2026)

SynthSeg-Agents: Multi-Agent Synthetic Data Generation for Zero-Shot Weakly Supervised Semantic Segmentation
by: Wu, Wangyu, et al.
Published: (2025)

SynthRAR: Ring Artifacts Reduction in CT with Unrolled Network and Synthetic Data Training
by: Yang, Hongxu, et al.
Published: (2026)

Deciphering the Role of Representation Disentanglement: Investigating Compositional Generalization in CLIP Models
by: Abbasi, Reza, et al.
Published: (2024)

Through the Lens of Character: Resolving Modality-Role Interference in Multimodal Role-Playing Agent
by: Tang, Yihong, et al.
Published: (2026)

psPRF: Pansharpening Planar Neural Radiance Field for Generalized 3D Reconstruction Satellite Imagery
by: Zhang, Tongtong, et al.
Published: (2024)

Data Distribution Distilled Generative Model for Generalized Zero-Shot Recognition
by: Wang, Yijie, et al.
Published: (2024)

Cross-Lingual SynthDocs: A Large-Scale Synthetic Corpus for Any to Arabic OCR and Document Understanding
by: Al-Homoud, Haneen, et al.
Published: (2025)

CDUL: CLIP-Driven Unsupervised Learning for Multi-Label Image Classification
by: Abdelfattah, Rabab, et al.
Published: (2023)

Zero-Shot Class Unlearning in CLIP with Synthetic Samples
by: Kravets, A., et al.
Published: (2024)

CultureCLIP: Empowering CLIP with Cultural Awareness through Synthetic Images and Contextualized Captions
by: Huang, Yuchen, et al.
Published: (2025)

TripletCLIP: Improving Compositional Reasoning of CLIP via Synthetic Vision-Language Negatives
by: Patel, Maitreya, et al.
Published: (2024)

SAM-MIL: A Spatial Contextual Aware Multiple Instance Learning Approach for Whole Slide Image Classification
by: Fang, Heng, et al.
Published: (2024)

SynthLight: Portrait Relighting with Diffusion Model by Learning to Re-render Synthetic Faces
by: Chaturvedi, Sumit, et al.
Published: (2025)

RoleMotion: A Large-Scale Dataset towards Robust Scene-Specific Role-Playing Motion Synthesis with Fine-grained Descriptions
by: Peng, Junran, et al.
Published: (2025)

NeuroCLIP: Neuromorphic Data Understanding by CLIP and SNN
by: Guo, Yufei, et al.
Published: (2023)

Harnessing Textual Semantic Priors for Knowledge Transfer and Refinement in CLIP-Driven Continual Learning
by: He, Lingfeng, et al.
Published: (2025)

Synth$^2$: Boosting Visual-Language Models with Synthetic Captions and Image Embeddings
by: Sharifzadeh, Sahand, et al.
Published: (2024)

ARM: A Learnable, Plug-and-Play Module for CLIP-based Open-vocabulary Semantic Segmentation
by: Liu, Ziquan, et al.
Published: (2025)

SuperCLIP: CLIP with Simple Classification Supervision
by: Zhao, Weiheng, et al.
Published: (2025)

Restormer-Plus for Real World Image Deraining: One State-of-the-Art Solution to the GT-RAIN Challenge (CVPR 2023 UG2+ Track 3)
by: Zheng, Chaochao, et al.
Published: (2023)

IPAD-CLIP: Teaching CLIP to Detect Image Local Perceptual Artifacts
by: Wang, Juan, et al.
Published: (2026)

GUESS: GradUally Enriching SyntheSis for Text-Driven Human Motion Generation
by: Gao, Xuehao, et al.
Published: (2024)

Exploring the Role of Synthetic Data Augmentation in Controllable Human-Centric Video Generation
by: Fei, Yuanchen, et al.
Published: (2026)

ReCLIP++: Learn to Rectify the Bias of CLIP for Unsupervised Semantic Segmentation
by: Wang, Jingyun, et al.
Published: (2024)

Infrared and Visible Image Fusion with Language-Driven Loss in CLIP Embedding Space
by: Wang, Yuhao, et al.
Published: (2024)