:: Library Catalog

Bibliographic Details
Main Authors:	Tanaka, Hiroshi; Rao, Anika; Satou, Hana; Johnson, Michael; García, Sofia
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2506.12724

Similar Items

Dynamic Epsilon Scheduling: A Multi-Factor Adaptive Perturbation Budget for Adversarial Training
by: Mitkiy, Alan, et al.
Published: (2025)

Learning to Fuse: Modality-Aware Adaptive Scheduling for Robust Multimodal Foundation Models
by: Bennett, Liam, et al.
Published: (2025)

Fusing Physics-Driven Strategies and Cross-Modal Adversarial Learning: Toward Multi-Domain Applications
by: Satou, Hana, et al.
Published: (2024)

GAMA: Geometry-Aware Manifold Alignment via Structured Adversarial Perturbations for Robust Domain Adaptation
by: Satou, Hana, et al.
Published: (2025)

On the Mechanisms of Adversarial Data Augmentation for Robust and Adaptive Transfer Learning
by: Satou, Hana, et al.
Published: (2025)

Geometrically Regularized Transfer Learning with On-Manifold and Off-Manifold Perturbation
by: Satou, Hana, et al.
Published: (2025)

Disentangled Geometric Alignment with Adaptive Contrastive Perturbation for Reliable Domain Transfer
by: Collins, Emma, et al.
Published: (2025)

Semantic-Preserving Cross-Style Visual Reasoning for Robust Multi-Modal Understanding in Large Vision-Language Models
by: Nakayama, Aya, et al.
Published: (2025)

Analyzing Reasoning Consistency in Large Multimodal Models under Cross-Modal Conflicts
by: Zhu, Zhihao, et al.
Published: (2026)

Asymmetric Cross-Modal Knowledge Distillation: Bridging Modalities with Weak Semantic Consistency
by: Wei, Riling, et al.
Published: (2025)

Combating Visual Neglect and Semantic Drift in Large Multimodal Models for Enhanced Cross-Modal Retrieval
by: Zhang, Guosheng, et al.
Published: (2026)

Confidence Contours: Uncertainty-Aware Annotation for Medical Semantic Segmentation
by: Ye, Andre, et al.
Published: (2023)

Self-Enhanced Image Clustering with Cross-Modal Semantic Consistency
by: Li, Zihan, et al.
Published: (2025)

Improving Multimodal Sentiment Analysis via Modality Optimization and Dynamic Primary Modality Selection
by: Yang, Dingkang, et al.
Published: (2025)

Robust Multimodal Semantic Segmentation with Balanced Modality Contributions
by: Tan, Jiaqi, et al.
Published: (2025)

Uncertainty-Participation Context Consistency Learning for Semi-supervised Semantic Segmentation
by: Yin, Jianjian, et al.
Published: (2024)

Semantic Alignment for Multimodal Large Language Models
by: Wu, Tao, et al.
Published: (2024)

TsCA: On the Semantic Consistency Alignment via Conditional Transport for Compositional Zero-Shot Learning
by: Li, Miaoge, et al.
Published: (2024)

Matching Semantically Similar Non-Identical Objects
by: Marumo, Yusuke, et al.
Published: (2024)

BiXFormer: A Robust Framework for Maximizing Modality Effectiveness in Multi-Modal Semantic Segmentation
by: Chen, Jialei, et al.
Published: (2025)

U3M: Unbiased Multiscale Modal Fusion Model for Multimodal Semantic Segmentation
by: Li, Bingyu, et al.
Published: (2024)

Genesis: Multimodal Driving Scene Generation with Spatio-Temporal and Cross-Modal Consistency
by: Guo, Xiangyu, et al.
Published: (2025)

Uncertainty-Aware Knowledge Distillation for Multimodal Large Language Models
by: Sun, Jingchen, et al.
Published: (2026)

Diffusion-Guided Semantic Consistency for Multimodal Heterogeneity
by: Liu, Jing, et al.
Published: (2026)

MASQuant: Modality-Aware Smoothing Quantization for Multimodal Large Language Models
by: Hu, Lulu, et al.
Published: (2026)

MILES: Modality-Informed Learning Rate Scheduler for Balancing Multimodal Learning
by: Guerra-Manzanares, Alejandro, et al.
Published: (2025)

Mitigating Modality Prior-Induced Hallucinations in Multimodal Large Language Models via Deciphering Attention Causality
by: Zhou, Guanyu, et al.
Published: (2024)

Uncertainty Quantification for Multimodal Large Language Models with Incoherence-adjusted Semantic Volume
by: Lau, Gregory Kang Ruey, et al.
Published: (2026)

Optimizing ID Consistency in Multimodal Large Models: Facial Restoration via Alignment, Entanglement, and Disentanglement
by: Dong, Yuran, et al.
Published: (2026)

Prototype-Enhanced Confidence Modeling for Cross-Modal Medical Image-Report Retrieval
by: Gowda, Shreyank N, et al.
Published: (2025)

Instinct vs. Reflection: Unifying Token and Verbalized Confidence in Multimodal Large Models
by: Dang, Yunkai, et al.
Published: (2026)

Zero-shot Action Localization via the Confidence of Large Vision-Language Models
by: Aklilu, Josiah, et al.
Published: (2024)

Uncertainty-o: One Model-agnostic Framework for Unveiling Uncertainty in Large Multimodal Models
by: Zhang, Ruiyang, et al.
Published: (2025)

Dynamic Cross-Modal Alignment for Robust Semantic Location Prediction
by: Jing, Liu, et al.
Published: (2024)

Static-Dynamic Class-level Perception Consistency in Video Semantic Segmentation
by: Cen, Zhigang, et al.
Published: (2024)

MAGIC++: Efficient and Resilient Modality-Agnostic Semantic Segmentation via Hierarchical Modality Selection
by: Zheng, Xu, et al.
Published: (2024)

Inference-Time Dynamic Modality Selection for Incomplete Multimodal Classification
by: Du, Siyi, et al.
Published: (2026)

SGMA: Semantic-Guided Modality-Aware Segmentation for Remote Sensing with Incomplete Multimodal Data
by: Wen, Lekang, et al.
Published: (2026)

Parameter-Efficient Modality-Balanced Symmetric Fusion for Multimodal Remote Sensing Semantic Segmentation
by: Li, Haocheng, et al.
Published: (2026)

Constant Rate Scheduling: A General Framework for Optimizing Diffusion Noise Schedule via Distributional Change
by: Okada, Shuntaro, et al.
Published: (2024)