:: Library Catalog

Bibliographic Details
Main Authors:	Zhu, Hongye; Liu, Xuan; Ba, Yanwen; Xue, Jingye; Zhang, Shigeng
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition; Machine Learning
Online Access:	https://arxiv.org/abs/2511.23070

Similar Items

Can multimodal representation learning by alignment preserve modality-specific information?
by: Thoreau, Romain, et al.
Published: (2025)

Selective experience replay compression using coresets for lifelong deep reinforcement learning in medical imaging
by: Zheng, Guangyao, et al.
Published: (2023)

MULTIAQUA: A multimodal maritime dataset and robust training strategies for multimodal semantic segmentation
by: Muhovič, Jon, et al.
Published: (2025)

Cross-modal feature fusion for robust point cloud registration with ambiguous geometry
by: Wang, Zhaoyi, et al.
Published: (2025)

Spectral regularization for adversarially-robust representation learning
by: Yang, Sheng, et al.
Published: (2024)

SELECTOR: Heterogeneous graph network with convolutional masked autoencoder for multimodal robust prediction of cancer survival
by: Pan, Liangrui, et al.
Published: (2024)

Multi-modal learning for geospatial vegetation forecasting
by: Benson, Vitus, et al.
Published: (2023)

CD-Buffer: Complementary Dual-Buffer Framework for Test-Time Adaptation in Adverse Weather Object Detection
by: Song, Youngjun, et al.
Published: (2026)

Enhancing multimodal cooperation via sample-level modality valuation
by: Wei, Yake, et al.
Published: (2023)

Buffer layers for Test-Time Adaptation
by: Kim, Hyeongyu, et al.
Published: (2025)

Improving deep learning with prior knowledge and cognitive models: A survey on enhancing explainability, adversarial robustness and zero-shot learning
by: Mumuni, Fuseini, et al.
Published: (2024)

Review of multimodal machine learning approaches in healthcare
by: Krones, Felix, et al.
Published: (2024)

What do vision-language models see in the context? Investigating multimodal in-context learning
by: Santos, Gabriel O. dos, et al.
Published: (2025)

What to align in multimodal contrastive learning?
by: Dufumier, Benoit, et al.
Published: (2024)

Multi-modal Co-learning for Earth Observation: Enhancing single-modality models via modality collaboration
by: Mena, Francisco, et al.
Published: (2025)

Improving Accuracy-robustness Trade-off via Pixel Reweighted Adversarial Training
by: Zhang, Jiacheng, et al.
Published: (2024)

Learnable Cross-modal Knowledge Distillation for Multi-modal Learning with Missing Modality
by: Wang, Hu, et al.
Published: (2023)

VLM-Pruner: Buffering for Spatial Sparsity in an Efficient VLM Centrifugal Token Pruning Paradigm
by: Wu, Zhenkai, et al.
Published: (2025)

Adaptive few-shot learning for robust part quality classification in two-photon lithography
by: Jia, Sixian, et al.
Published: (2026)

A training regime to learn unified representations from complementary breast imaging modalities
by: Sharma, Umang, et al.
Published: (2024)

Resolution-free neural surrogates for geometric parameterization and mapping with spatially varying fields
by: Huang, Yanwen, et al.
Published: (2026)

Learning-based density-equalizing map
by: Huang, Yanwen, et al.
Published: (2025)

Closing the gap in multimodal medical representation alignment
by: Grassucci, Eleonora, et al.
Published: (2026)

Explaining latent representations of generative models with large multimodal models
by: Zhu, Mengdan, et al.
Published: (2024)

A multimodal slice discovery framework for systematic failure detection and explanation in medical image classification
by: Liu, Yixuan, et al.
Published: (2026)

Universal representations: The missing link between faces, text, planktons, and cat breeds
by: Bilen, Hakan, et al.
Published: (2017)

AG-Fusion: adaptive gated multimodal fusion for 3d object detection in complex scenes
by: Liu, Sixian, et al.
Published: (2025)

Multi-level Cross-modal Alignment for Image Clustering
by: Qiu, Liping, et al.
Published: (2024)

Classification of freshwater snails of the genus Radomaniola with multimodal triplet networks
by: Vetter, Dennis, et al.
Published: (2024)

Open-world machine learning: A review and new outlooks
by: Zhu, Fei, et al.
Published: (2024)

CLIP Behaves like a Bag-of-Words Model Cross-modally but not Uni-modally
by: Koishigarina, Darina, et al.
Published: (2025)

Using deep learning to enhance electronic service quality: Application to real estate websites
by: Elnagar, Samaa
Published: (2024)

Robust Classification by Coupling Data Mollification with Label Smoothing
by: Heinonen, Markus, et al.
Published: (2024)

Origins of Creativity in Attention-Based Diffusion Models
by: Finn, Emma, et al.
Published: (2025)

Unsupervised Waste Classification By Dual-Encoder Contrastive Learning and Multi-Clustering Voting (DECMCV)
by: Huang, Kui, et al.
Published: (2025)

Mirage in the Eyes: Hallucination Attack on Multi-modal Large Language Models with Only Attention Sink
by: Wang, Yining, et al.
Published: (2025)

Plane Geometry Problem Solving with Multi-modal Reasoning: A Survey
by: Cho, Seunghyuk, et al.
Published: (2025)

Diffusion Classifier Guidance for Non-robust Classifiers
by: Vaeth, Philipp, et al.
Published: (2025)

Is it the model or the metric -- On robustness measures of deeplearning models
by: Lyu, Zhijin, et al.
Published: (2024)

Improving robustness to corruptions with multiplicative weight perturbations
by: Trinh, Trung, et al.
Published: (2024)