:: Library Catalog

Bibliographic Details
Main Authors:	Xu, Jingyu; Wang, Yang
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2503.05626

Similar Items

Astrea: A MOE-based Visual Understanding Model with Progressive Alignment
by: Yang, Xiaoda, et al.
Published: (2025)

NESTOR: A Nested MOE-based Neural Operator for Large-Scale PDE Pre-Training
by: Sun, Dengdi, et al.
Published: (2026)

Research on Splicing Image Detection Algorithms Based on Natural Image Statistical Characteristics
by: Xiang, Ao, et al.
Published: (2024)

Research on Edge Detection of LiDAR Images Based on Artificial Intelligence Technology
by: Yang, Haowei, et al.
Published: (2024)

Research on Detection of Floating Objects in River and Lake Based on AI Intelligent Image Recognition
by: Zhang, Jingyu, et al.
Published: (2024)

MultiSense-Pneumo: A Multimodal Learning Framework for Pneumonia Screening in Resource-Constrained Settings
by: Jayakody, Dineth, et al.
Published: (2026)

Explainable Deep Learning in Medical Imaging: Brain Tumor and Pneumonia Detection
by: Erukude, Sai Teja, et al.
Published: (2025)

LoMOE: Localized Multi-Object Editing via Multi-Diffusion
by: Chakrabarty, Goirik, et al.
Published: (2024)

Explainable Deep Learning for Pediatric Pneumonia Detection in Chest X-Ray Images
by: Khadidos, Adil O., et al.
Published: (2026)

VMID: A Multimodal Fusion LLM Framework for Detecting and Identifying Misinformation of Short Videos
by: Zhong, Weihao, et al.
Published: (2024)

Breast Cancer Detection in Thermographic Images via Diffusion-Based Augmentation and Nonlinear Feature Fusion
by: Salem, Sepehr, et al.
Published: (2025)

Pediatric Pneumonia Detection from Chest X-Rays: A Comparative Study of Transfer Learning and Custom CNNs
by: Choudhury, Agniv Roy
Published: (2025)

Feature Recalibration Based Olfactory-Visual Multimodal Model for Enhanced Rice Deterioration Detection
by: Zhao, Rongqiang, et al.
Published: (2026)

A Simple Aerial Detection Baseline of Multimodal Language Models
by: Li, Qingyun, et al.
Published: (2025)

JL1-CD: A New Benchmark for Remote Sensing Change Detection and a Robust Multi-Teacher Knowledge Distillation Framework
by: Liu, Ziyuan, et al.
Published: (2025)

Transformer-Based Framework for Motion Capture Denoising and Anomaly Detection in Medical Rehabilitation
by: Cai, Yeming, et al.
Published: (2025)

A Dual Cross-Attention Graph Learning Framework For Multimodal MRI-Based Major Depressive Disorder Detection
by: Alotaibi, Nojod M., et al.
Published: (2026)

Deep Learning-Based Computer Vision Models for Early Cancer Detection Using Multimodal Medical Imaging and Radiogenomic Integration Frameworks
by: Oghenekaro, Emmanuella Avwerosuoghene
Published: (2025)

DBMF: A Dual-Branch Multimodal Framework for Out-of-Distribution Detection
by: Yue, Jiangbei, et al.
Published: (2026)

M-Gaussian: An Magnetic Gaussian Framework for Efficient Multi-Stack MRI Reconstruction
by: Zheng, Kangyuan, et al.
Published: (2026)

Leveraging Chat-Based Large Vision Language Models for Multimodal Out-Of-Context Detection
by: Shalabi, Fatma, et al.
Published: (2024)

Xmodel-VLM: A Simple Baseline for Multimodal Vision Language Model
by: Xu, Wanting, et al.
Published: (2024)

Beyond Face Swapping: A Diffusion-Based Digital Human Benchmark for Multimodal Deepfake Detection
by: Liu, Jiaxin, et al.
Published: (2025)

Lightweight Weighted Average Ensemble Model for Pneumonia Detection in Chest X-Ray Images
by: Nettur, Suresh Babu, et al.
Published: (2025)

Pilot: Building the Federated Multimodal Instruction Tuning Framework
by: Xiong, Baochen, et al.
Published: (2025)

UniCMs: A Unified Consistency Model For Efficient Multimodal Generation and Understanding
by: Xu, Chenkai, et al.
Published: (2025)

A Context-aware Attention and Graph Neural Network-based Multimodal Framework for Misogyny Detection
by: Rehman, Mohammad Zia Ur, et al.
Published: (2025)

Loupe: A Generalizable and Adaptive Framework for Image Forgery Detection
by: Jiang, Yuchu, et al.
Published: (2025)

MMAD: A Comprehensive Benchmark for Multimodal Large Language Models in Industrial Anomaly Detection
by: Jiang, Xi, et al.
Published: (2024)

MEGL: Multimodal Explanation-Guided Learning
by: Zhang, Yifei, et al.
Published: (2024)

Complementary Pseudo Multimodal Feature for Point Cloud Anomaly Detection
by: Cao, Yunkang, et al.
Published: (2023)

MM-NeuroOnco: A Multimodal Benchmark and Instruction Dataset for MRI-Based Brain Tumor Diagnosis
by: Guo, Feng, et al.
Published: (2026)

Contextual Object Detection with Multimodal Large Language Models
by: Zang, Yuhang, et al.
Published: (2023)

Spatio-Temporal Token Pruning for Efficient High-Resolution GUI Agents
by: Xu, Zhou, et al.
Published: (2026)

On the Out-Of-Distribution Generalization of Multimodal Large Language Models
by: Zhang, Xingxuan, et al.
Published: (2024)

Quanvolutional Neural Networks for Pneumonia Detection: An Efficient Quantum-Assisted Feature Extraction Paradigm
by: Tanbhir, Gazi, et al.
Published: (2025)

Discrete Diffusion Models with MLLMs for Unified Medical Multimodal Generation
by: Mao, Jiawei, et al.
Published: (2025)

YOLO-PPA based Efficient Traffic Sign Detection for Cruise Control in Autonomous Driving
by: Zhang, Jingyu, et al.
Published: (2024)

HumaniBench: A Human-Centric Framework for Large Multimodal Models Evaluation
by: Raza, Shaina, et al.
Published: (2025)

When Models Judge Themselves: Unsupervised Self-Evolution for Multimodal Reasoning
by: Wu, Zhengxian, et al.
Published: (2026)