| Main Authors: | Kang, Bin, Chen, Bin, Wang, Junjie, Li, Yulin, Zhao, Junzhi, Tian, Zhuotao |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2510.05586 |
Similar Items
DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception
by: Wang, Junjie, et al.
Published: (2025)
AgentSteerTTS: A Multi-Agent Closed-Loop Framework for Composite-Instruction Text-to-Speech
by: Kang, Bin, et al.
Published: (2026)
Multi-path Exploration and Feedback Adjustment for Text-to-Image Person Retrieval
by: Kang, Bin, et al.
Published: (2024)
Less Is More, but Where? Dynamic Token Compression via LLM-Guided Keyframe Prior
by: Li, Yulin, et al.
Published: (2025)
Generalized Decoupled Learning for Enhancing Open-Vocabulary Dense Perception
by: Wang, Junjie, et al.
Published: (2025)
Explore the Potential of CLIP for Training-Free Open Vocabulary Semantic Segmentation
by: Shao, Tong, et al.
Published: (2024)
EF-Calib: Spatiotemporal Calibration of Event- and Frame-Based Cameras Using Continuous-Time Trajectories
by: Wang, Shaoan, et al.
Published: (2024)
Benchmarking and Evolving Reason-Reflect-Rectify for Reflective Visual Generation
by: Wang, Junjie, et al.
Published: (2026)
OV-DQUO: Open-Vocabulary DETR with Denoising Text Query Training and Open-World Unknown Objects Supervision
by: Wang, Junjie, et al.
Published: (2024)
CalibAnyView: Beyond Single-View Camera Calibration in the Wild
by: Li, Boying, et al.
Published: (2026)
ReCLIP++: Learn to Rectify the Bias of CLIP for Unsupervised Semantic Segmentation
by: Wang, Jingyun, et al.
Published: (2024)
Multi-Perspective Subimage CLIP with Keyword Guidance for Remote Sensing Image-Text Retrieval
by: Li, Yifan, et al.
Published: (2026)
Contrast-Aware Calibration for Fine-Tuned CLIP: Leveraging Image-Text Alignment
by: Lv, Song-Lin, et al.
Published: (2025)
Click-Calib: A Robust Extrinsic Calibration Method for Surround-View Systems
by: Wang, Lihao
Published: (2025)
Parrot Captions Teach CLIP to Spot Text
by: Lin, Yiqi, et al.
Published: (2023)
PTZ-Calib: Robust Pan-Tilt-Zoom Camera Calibration
by: Guo, Jinhui, et al.
Published: (2025)
GeoCalib: Learning Single-image Calibration with Geometric Optimization
by: Veicht, Alexander, et al.
Published: (2024)
RC-AutoCalib: An End-to-End Radar-Camera Automatic Calibration Network
by: Luu, Van-Tin, et al.
Published: (2025)
From Mapping to Composing: A Two-Stage Framework for Zero-shot Composed Image Retrieval
by: Wang, Yabing, et al.
Published: (2025)
OA-CNNs: Omni-Adaptive Sparse CNNs for 3D Semantic Segmentation
by: Peng, Bohao, et al.
Published: (2024)
Extending CLIP's Image-Text Alignment to Referring Image Segmentation
by: Kim, Seoyeon, et al.
Published: (2023)
FoCLIP: A Feature-Space Misalignment Framework for CLIP-Based Image Manipulation and Detection
by: Chen, Yulin, et al.
Published: (2025)
SSR: Semantic and Spatial Rectification for CLIP-based Weakly Supervised Segmentation
by: Bi, Xiuli, et al.
Published: (2025)
GroupContrast: Semantic-aware Self-supervised Representation Learning for 3D Understanding
by: Wang, Chengyao, et al.
Published: (2024)
Tensor-Based Self-Calibration of Cameras via the TrifocalCalib Method
by: Schroeder, Gregory, et al.
Published: (2025)
Ultrasound-CLIP: Semantic-Aware Contrastive Pre-training for Ultrasound Image-Text Understanding
by: Jin, Jiayun, et al.
Published: (2026)
RAVES-Calib: Robust, Accurate and Versatile Extrinsic Self Calibration Using Optimal Geometric Features
by: Zhang, Haoxin, et al.
Published: (2025)
Consistency Beyond Contrast: Enhancing Open-Vocabulary Object Detection Robustness via Contextual Consistency Learning
by: Li, Bozhao, et al.
Published: (2026)
CalibFormer: A Transformer-based Automatic LiDAR-Camera Calibration Network
by: Xiao, Yuxuan, et al.
Published: (2023)
Text-guided Image Restoration and Semantic Enhancement for Text-to-Image Person Retrieval
by: Liu, Delong, et al.
Published: (2023)
E-Calib: A Fast, Robust and Accurate Calibration Toolbox for Event Cameras
by: Salah, Mohammed, et al.
Published: (2023)
CasCalib: Cascaded Calibration for Motion Capture from Sparse Unsynchronized Cameras
by: Tang, James, et al.
Published: (2024)
AnyCalib: On-Manifold Learning for Model-Agnostic Single-View Camera Calibration
by: Tirado-Garín, Javier, et al.
Published: (2025)
DiffCalib: Reformulating Monocular Camera Calibration as Diffusion-Based Dense Incident Map Generation
by: He, Xiankang, et al.
Published: (2024)
CultureCLIP: Empowering CLIP with Cultural Awareness through Synthetic Images and Contextualized Captions
by: Huang, Yuchen, et al.
Published: (2025)
ARC-Calib: Autonomous Markerless Camera-to-Robot Calibration via Exploratory Robot Motions
by: Chanrungmaneekul, Podshara, et al.
Published: (2025)
SemanticVLA: Semantic-Aligned Sparsification and Enhancement for Efficient Robotic Manipulation
by: Li, Wei, et al.
Published: (2025)
MirrorCalib: Utilizing Human Pose Information for Mirror-based Virtual Camera Calibration
by: Liao, Longyun, et al.
Published: (2023)
EmoCaliber: Advancing Reliable Visual Emotion Comprehension via Confidence Verbalization and Calibration
by: Wu, Daiqing, et al.
Published: (2025)
MTA-CLIP: Language-Guided Semantic Segmentation with Mask-Text Alignment
by: Das, Anurag, et al.
Published: (2024)