| Main Authors: | Zhang, Tong; Shen, Shu; Chen, C. L. Philip |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2502.19674 |
Similar Items
Logits DeConfusion with CLIP for Few-Shot Learning
by: Li, Shuo, et al.
Published: (2025)
Multi-QuAD: Multi-Level Quality-Adaptive Dynamic Network for Reliable Multimodal Classification
by: Shen, Shu, et al.
Published: (2024)
Test-time Adaptive Hierarchical Co-enhanced Denoising Network for Reliable Multimodal Classification
by: Shen, Shu, et al.
Published: (2026)
AIM: Adaptive Intra-Network Modulation for Balanced Multimodal Learning
by: Shen, Shu, et al.
Published: (2025)
Uncovering Entity Identity Confusion in Multimodal Knowledge Editing
by: Wu, Shu, et al.
Published: (2026)
De-Confusing Pseudo-Labels in Source-Free Domain Adaptation
by: Diamant, Idit, et al.
Published: (2024)
CUE: Concept-Aware Multi-Label Expansion to Mitigate Concept Confusion in Long-Tailed Learning
by: Zhang, Ruichi, et al.
Published: (2026)
DeconfuseTrack: Dealing with Confusion for Multi-Object Tracking
by: Huang, Cheng, et al.
Published: (2024)
ConfusionBench: An Expert-Validated Benchmark for Confusion Recognition and Localization in Educational Videos
by: Dong, Lu, et al.
Published: (2026)
Backdooring CLIP through Concept Confusion
by: Hu, Lijie, et al.
Published: (2025)
Closing the Confusion Loop: CLIP-Guided Alignment for Source-Free Domain Adaptation
by: Wang, Shanshan, et al.
Published: (2026)
Resolving Multi-Condition Confusion for Finetuning-Free Personalized Image Generation
by: Huang, Qihan, et al.
Published: (2024)
MSF: Efficient Diffusion Model Via Multi-Scale Latent Factorize
by: Xu, Haohang, et al.
Published: (2025)
CARE: Class-Adaptive Expert Consensus for Reliable Learning with Long-Tailed Noisy Labels
by: Li, Mengke, et al.
Published: (2026)
Windsock is Dancing: Adaptive Multimodal Retrieval-Augmented Generation
by: Zhao, Shu, et al.
Published: (2025)
Reliable Reasoning in SVG-LLMs via Multi-Task Multi-Reward Reinforcement Learning
by: Wang, Haomin, et al.
Published: (2026)
Multi-task Gaze Estimation Via Unidirectional Convolution
by: Cheng, Zhang, et al.
Published: (2024)
Phantom-Insight: Adaptive Multi-cue Fusion for Video Camouflaged Object Detection with Multimodal LLM
by: Zhang, Hua, et al.
Published: (2025)
Adaptive Learning for Multi-view Stereo Reconstruction
by: Min, Qinglu, et al.
Published: (2024)
MultiPull: Detailing Signed Distance Functions by Pulling Multi-Level Queries at Multi-Step
by: Noda, Takeshi, et al.
Published: (2024)
Reliable Representation Learning for Incomplete Multi-View Missing Multi-Label Classification
by: Liu, Chengliang, et al.
Published: (2023)
CATP: Contextually Adaptive Token Pruning for Efficient and Enhanced Multimodal In-Context Learning
by: Li, Yanshu, et al.
Published: (2025)
CrypticBio: A Large Multimodal Dataset for Visually Confusing Biodiversity
by: Manolache, Georgiana, et al.
Published: (2025)
Robust Saliency-Aware Distillation for Few-shot Fine-grained Visual Recognition
by: Liu, Haiqi, et al.
Published: (2023)
Vision Language Models are Confused Tourists
by: Irawan, Patrick Amadeus, et al.
Published: (2025)
Adaptive Deep Learning for Breast Cancer Subtype Prediction Via Misprediction Risk Analysis
by: Sheeraz, Gul, et al.
Published: (2025)
Exploring Token-Level Augmentation in Vision Transformer for Semi-Supervised Semantic Segmentation
by: Zhang, Dengke, et al.
Published: (2025)
Visual Confused Deputy: Exploiting and Defending Perception Failures in Computer-Using Agents
by: Liu, Xunzhuo, et al.
Published: (2026)
Adaptive Multi-step Refinement Network for Robust Point Cloud Registration
by: Chen, Zhi, et al.
Published: (2023)
User Experience Estimation in Human-Robot Interaction Via Multi-Instance Learning of Multimodal Social Signals
by: Miyoshi, Ryo, et al.
Published: (2025)
Universal Incremental Learning: Mitigating Confusion from Inter- and Intra-task Distribution Randomness
by: Luo, Sheng, et al.
Published: (2025)
M3-AGIQA: Multimodal, Multi-Round, Multi-Aspect AI-Generated Image Quality Assessment
by: Cui, Chuan, et al.
Published: (2025)
Docopilot: Improving Multimodal Models for Document-Level Understanding
by: Duan, Yuchen, et al.
Published: (2025)
The Comparability of Model Fusion to Measured Data in Confuser Rejection
by: Flynn, Conor, et al.
Published: (2025)
Evaluating Attribute Confusion in Fashion Text-to-Image Generation
by: Liu, Ziyue, et al.
Published: (2025)
ViaRL: Adaptive Temporal Grounding via Visual Iterated Amplification Reinforcement Learning
by: Xu, Ziqiang, et al.
Published: (2025)
Learning A Multi-Task Transformer Via Unified And Customized Instruction Tuning For Chest Radiograph Interpretation
by: Xu, Lijian, et al.
Published: (2023)
Coupled Confusion Correction: Learning from Crowds with Sparse Annotations
by: Zhang, Hansong, et al.
Published: (2023)
Multi-Crit: Benchmarking Multimodal Judges on Pluralistic Criteria-Following
by: Xiong, Tianyi, et al.
Published: (2025)
ASPO: Adaptive Sentence-Level Preference Optimization for Fine-Grained Multimodal Reasoning
by: Wang, Yeyuan, et al.
Published: (2025)