| Main Authors: | Huangfu, Yuanxiang, Wang, Chaochao, Wang, Weilei |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2511.05057 |
Similar Items
SynthCLIP: Are We Ready for a Fully Synthetic CLIP Training?
by: Hammoud, Hasan Abed Al Kader, et al.
Published: (2024)
SynthVerse: A Large-Scale Diverse Synthetic Dataset for Point Tracking
by: Zhao, Weiguang, et al.
Published: (2026)
Language Plays a Pivotal Role in the Object-Attribute Compositional Generalization of CLIP
by: Abbasi, Reza, et al.
Published: (2024)
Synth-Align: Improving Trustworthiness in Vision-Language Model with Synthetic Preference Data Alignment
by: Wijaya, Robert, et al.
Published: (2024)
Synth It Like KITTI: Synthetic Data Generation for Object Detection in Driving Scenarios
by: Marcus, Richard, et al.
Published: (2025)
SRL-CLIP: Efficient CLIP Video Adaptation via Structured Semantic Role Labels
by: Singh, Darshan, et al.
Published: (2024)
SoccerSynth-Detection: A Synthetic Dataset for Soccer Player Detection
by: Qin, Haobin, et al.
Published: (2025)
Scene Graph Generation with Role-Playing Large Language Models
by: Chen, Guikun, et al.
Published: (2024)
SynPlay: Large-Scale Synthetic Human Data with Real-World Diversity for Aerial-View Perception
by: Yim, Jinsub, et al.
Published: (2024)
Know "No" Better: A Data-Driven Approach for Enhancing Negation Awareness in CLIP
by: Park, Junsung, et al.
Published: (2025)
AnySynth: Harnessing the Power of Image Synthetic Data Generation for Generalized Vision-Language Tasks
by: Li, You, et al.
Published: (2024)
CLIPS: An Enhanced CLIP Framework for Learning with Synthetic Captions
by: Liu, Yanqing, et al.
Published: (2024)
SynthPID: P&ID digitization from Topology-Preserving Synthetic Data
by: Prasad, Suraj, et al.
Published: (2026)
AmodalSynthDrive: A Synthetic Amodal Perception Dataset for Autonomous Driving
by: Sekkat, Ahmed Rida, et al.
Published: (2023)
SynthForensics: Benchmarking and Evaluating People-Centric Synthetic Video Deepfakes
by: Leotta, Roberto, et al.
Published: (2026)
SynthSeg-Agents: Multi-Agent Synthetic Data Generation for Zero-Shot Weakly Supervised Semantic Segmentation
by: Wu, Wangyu, et al.
Published: (2025)
SynthRAR: Ring Artifacts Reduction in CT with Unrolled Network and Synthetic Data Training
by: Yang, Hongxu, et al.
Published: (2026)
Deciphering the Role of Representation Disentanglement: Investigating Compositional Generalization in CLIP Models
by: Abbasi, Reza, et al.
Published: (2024)
Through the Lens of Character: Resolving Modality-Role Interference in Multimodal Role-Playing Agent
by: Tang, Yihong, et al.
Published: (2026)
psPRF:Pansharpening Planar Neural Radiance Field for Generalized 3D Reconstruction Satellite Imagery
by: Zhang, Tongtong, et al.
Published: (2024)
Data Distribution Distilled Generative Model for Generalized Zero-Shot Recognition
by: Wang, Yijie, et al.
Published: (2024)
Cross-Lingual SynthDocs: A Large-Scale Synthetic Corpus for Any to Arabic OCR and Document Understanding
by: Al-Homoud, Haneen, et al.
Published: (2025)
CDUL: CLIP-Driven Unsupervised Learning for Multi-Label Image Classification
by: Abdelfattah, Rabab, et al.
Published: (2023)
Zero-Shot Class Unlearning in CLIP with Synthetic Samples
by: Kravets, A., et al.
Published: (2024)
CultureCLIP: Empowering CLIP with Cultural Awareness through Synthetic Images and Contextualized Captions
by: Huang, Yuchen, et al.
Published: (2025)
TripletCLIP: Improving Compositional Reasoning of CLIP via Synthetic Vision-Language Negatives
by: Patel, Maitreya, et al.
Published: (2024)
SAM-MIL: A Spatial Contextual Aware Multiple Instance Learning Approach for Whole Slide Image Classification
by: Fang, Heng, et al.
Published: (2024)
SynthLight: Portrait Relighting with Diffusion Model by Learning to Re-render Synthetic Faces
by: Chaturvedi, Sumit, et al.
Published: (2025)
RoleMotion: A Large-Scale Dataset towards Robust Scene-Specific Role-Playing Motion Synthesis with Fine-grained Descriptions
by: Peng, Junran, et al.
Published: (2025)
NeuroCLIP: Neuromorphic Data Understanding by CLIP and SNN
by: Guo, Yufei, et al.
Published: (2023)
Harnessing Textual Semantic Priors for Knowledge Transfer and Refinement in CLIP-Driven Continual Learning
by: He, Lingfeng, et al.
Published: (2025)
Synth$^2$: Boosting Visual-Language Models with Synthetic Captions and Image Embeddings
by: Sharifzadeh, Sahand, et al.
Published: (2024)
ARM: A Learnable, Plug-and-Play Module for CLIP-based Open-vocabulary Semantic Segmentation
by: Liu, Ziquan, et al.
Published: (2025)
SuperCLIP: CLIP with Simple Classification Supervision
by: Zhao, Weiheng, et al.
Published: (2025)
Restormer-Plus for Real World Image Deraining: One State-of-the-Art Solution to the GT-RAIN Challenge (CVPR 2023 UG2+ Track 3)
by: Zheng, Chaochao, et al.
Published: (2023)
IPAD-CLIP: Teaching CLIP to Detect Image Local Perceptual Artifacts
by: Wang, Juan, et al.
Published: (2026)
GUESS:GradUally Enriching SyntheSis for Text-Driven Human Motion Generation
by: Gao, Xuehao, et al.
Published: (2024)
Exploring the Role of Synthetic Data Augmentation in Controllable Human-Centric Video Generation
by: Fei, Yuanchen, et al.
Published: (2026)
ReCLIP++: Learn to Rectify the Bias of CLIP for Unsupervised Semantic Segmentation
by: Wang, Jingyun, et al.
Published: (2024)
Infrared and Visible Image Fusion with Language-Driven Loss in CLIP Embedding Space
by: Wang, Yuhao, et al.
Published: (2024)