Library Catalog

Bibliographic Details
Main Authors:	Zhang, Qingyang; Wei, Yake; Han, Zongbo; Fu, Huazhu; Peng, Xi; Deng, Cheng; Hu, Qinghua; Xu, Cai; Wen, Jie; Hu, Di; Zhang, Changqing
Format:	Preprint
Published:	2024
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2404.18947

Similar Items

Out-Of-Distribution Detection with Diversification (Provably)
by: Yao, Haiyun, et al.
Published: (2024)

MULTIBENCH++: A Unified and Comprehensive Multimodal Fusion Benchmarking Across Specialized Domains
by: Xue, Leyan, et al.
Published: (2025)

Helping CLIP See Both the Forest and the Trees: A Decomposition and Description Approach
by: Xue, Leyan, et al.
Published: (2025)

ID-like Prompt Learning for Few-Shot Out-of-Distribution Detection
by: Bai, Yichen, et al.
Published: (2023)

MMPareto: Boosting Multimodal Learning with Innocent Unimodal Assistance
by: Wei, Yake, et al.
Published: (2024)

Selective Learning: Towards Robust Calibration with Dynamic Regularization
by: Han, Zongbo, et al.
Published: (2024)

Dig2DIG: Dig into Diffusion Information Gains for Image Fusion
by: Cao, Bing, et al.
Published: (2025)

MokA: Multimodal Low-Rank Adaptation for MLLMs
by: Wei, Yake, et al.
Published: (2025)

On-the-fly Modulation for Balanced Multimodal Learning
by: Wei, Yake, et al.
Published: (2024)

Retrieval-Augmented Prompt for OOD Detection
by: Han, Ruisong, et al.
Published: (2025)

Confidence-aware multi-modality learning for eye disease screening
by: Zou, Ke, et al.
Published: (2024)

Predictive Dynamic Fusion
by: Cao, Bing, et al.
Published: (2024)

Diagnosing and Re-learning for Balanced Multimodal Learning
by: Wei, Yake, et al.
Published: (2024)

Test-Time Dynamic Image Fusion
by: Cao, Bing, et al.
Published: (2024)

Adaptive Unimodal Regulation for Balanced Multimodal Information Acquisition
by: Huang, Chengxiang, et al.
Published: (2025)

The Best of Both Worlds: On the Dilemma of Out-of-distribution Detection
by: Zhang, Qingyang, et al.
Published: (2024)

Patch Matters: Training-free Fine-grained Image Caption Enhancement via Local Perception
by: Peng, Ruotian, et al.
Published: (2025)

Adapting Large Language Models for Content Moderation: Pitfalls in Data Engineering and Supervised Fine-tuning
by: Ma, Huan, et al.
Published: (2023)

RollingQ: Reviving the Cooperation Dynamics in Multimodal Transformer
by: Ni, Haotian, et al.
Published: (2025)

Computational Reasoning of Large Language Models
by: Wu, Haitao, et al.
Published: (2025)

Generalized Few-Shot Out-of-Distribution Detection
by: Li, Pinxuan, et al.
Published: (2025)

Prompt Guiding Multi-Scale Adaptive Sparse Representation-driven Network for Low-Dose CT MAR
by: Shi, Baoshun, et al.
Published: (2025)

Enhancing Modality Representation and Alignment for Multimodal Cold-start Active Learning
by: Shen, Meng, et al.
Published: (2024)

TEMPO: Scaling Test-time Training for Large Reasoning Models
by: Zhang, Qingyang, et al.
Published: (2026)

Quantifying and Enhancing Multi-modal Robustness with Modality Preference
by: Yang, Zequn, et al.
Published: (2024)

Enhancing multimodal cooperation via sample-level modality valuation
by: Wei, Yake, et al.
Published: (2023)

Dynamic Characteristics and Lateral‐Torsional Vibration Response of SRC Frame With Special‐Shaped Columns
by: Zongbo Hu, et al.
Published: (2025)

Hallucination of Multimodal Large Language Models: A Survey
by: Bai, Zechen, et al.
Published: (2024)

AdaFusion: Prompt-Guided Inference with Adaptive Fusion of Pathology Foundation Models
by: Xiao, Yuxiang, et al.
Published: (2025)

Content Generation Models in Computational Pathology: A Comprehensive Survey on Methods, Applications, and Challenges
by: Zhang, Yuan, et al.
Published: (2025)

Skip \n: A Simple Method to Reduce Hallucination in Large Vision-Language Models
by: Han, Zongbo, et al.
Published: (2024)

MIBench: Evaluating LMMs on Multimodal Interaction
by: Miao, Yu, et al.
Published: (2026)

DOTA: Distributional Test-Time Adaptation of Vision-Language Models
by: Han, Zongbo, et al.
Published: (2024)

Empowering Multimodal LLMs with External Tools: A Comprehensive Survey
by: An, Wenbin, et al.
Published: (2025)

Addressing Missing and Noisy Modalities in One Solution: Unified Modality-Quality Framework for Low-quality Multimodal Data
by: Mai, Sijie, et al.
Published: (2026)

Quantum Visual Word Sense Disambiguation: Unraveling Ambiguities Through Quantum Inference Model
by: Qiao, Wenbo, et al.
Published: (2025)

Right Question is Already Half the Answer: Fully Unsupervised LLM Reasoning Incentivization
by: Zhang, Qingyang, et al.
Published: (2025)

COME: Test-time adaption by Conservatively Minimizing Entropy
by: Zhang, Qingyang, et al.
Published: (2024)

Representation Learning for Tabular Data: A Comprehensive Survey
by: Jiang, Jun-Peng, et al.
Published: (2025)

A Visual Inertia‐Inspired Multimode Sensor Based on Pb–S Strongly Coupled Heterostructures for Information Fusion Positioning and Monitoring
by: Leping Li, et al.
Published: (2026)