| Main Authors: | Rachuri, Ravi Datta; Liao, Duoduo; Sarikonda, Samhita; Kondur, Datha Vaishnavi |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2412.17968 |
Similar Items
Multimodal Event Detection: Current Approaches and Defining the New Playground through LLMs and VLMs
by: Dey, Abhishek, et al.
Published: (2025)
Adaptive Signal Analysis for Automated Subsurface Defect Detection Using Impact Echo in Concrete Slabs
by: Pavurala, Deepthi, et al.
Published: (2024)
Do Understanding and Generation Fight? A Diagnostic Study of DPO for Unified Multimodal Models
by: Rao, Abinav, et al.
Published: (2026)
MambaFusion: Adaptive State-Space Fusion for Multimodal 3D Object Detection
by: Narayanan, Venkatraman, et al.
Published: (2026)
Bridging Pixels and Words: Mask-Aware Local Semantic Fusion for Multimodal Media Verification
by: Chen, Zizhao, et al.
Published: (2026)
Multi-View Industrial Anomaly Detection with Epipolar Constrained Cross-View Fusion
by: Liu, Yifan, et al.
Published: (2025)
Multispectral State-Space Feature Fusion: Bridging Shared and Cross-Parametric Interactions for Object Detection
by: Shen, Jifeng, et al.
Published: (2025)
BridgeNet: A Unified Multimodal Framework for Bridging 2D and 3D Industrial Anomaly Detection
by: Xiang, An, et al.
Published: (2025)
Visual Chain of Thought: Bridging Logical Gaps with Multimodal Infillings
by: Rose, Daniel, et al.
Published: (2023)
RoadFusion: Latent Diffusion Model for Pavement Defect Detection
by: Aqeel, Muhammad, et al.
Published: (2025)
Cross-Modal Purification and Fusion for Small-Object RGB-D Transmission-Line Defect Detection
by: Cui, Jiaming, et al.
Published: (2026)
TransMatch: A Transfer-Learning Framework for Defect Detection in Laser Powder Bed Fusion Additive Manufacturing
by: Ilani, Mohsen Asghari, et al.
Published: (2025)
Saliency-Guided Deep Learning for Bridge Defect Detection in Drone Imagery
by: Hebbache, Loucif, et al.
Published: (2025)
VaLID: Verification as Late Integration of Detections for LiDAR-Camera Fusion
by: Vats, Vanshika, et al.
Published: (2024)
COMO: Cross-Mamba Interaction and Offset-Guided Fusion for Multimodal Object Detection
by: Liu, Chang, et al.
Published: (2024)
PalmBridge: A Plug-and-Play Feature Alignment Framework for Open-Set Palmprint Verification
by: Zhang, Chenke, et al.
Published: (2026)
Diffusion Image Generation with Explicit Modeling of Data Manifold Geometry
by: Xue, Duoduo, et al.
Published: (2026)
S^2F-Net: A Robust Spatial-Spectral Fusion Framework for Cross-Model AIGC Detection
by: Hu, Xiangyu, et al.
Published: (2026)
IS-Fusion: Instance-Scene Collaborative Fusion for Multimodal 3D Object Detection
by: Yin, Junbo, et al.
Published: (2024)
LightFusion: A Light-weighted, Double Fusion Framework for Unified Multimodal Understanding and Generation
by: Wang, Zeyu, et al.
Published: (2025)
Multimodal Diffusion Bridge with Attention-Based SAR Fusion for Satellite Image Cloud Removal
by: Hu, Yuyang, et al.
Published: (2025)
GateFusion: Hierarchical Gated Cross-Modal Fusion for Active Speaker Detection
by: Wang, Yu, et al.
Published: (2025)
Task-Generalized Adaptive Cross-Domain Learning for Multimodal Image Fusion
by: Wang, Mengyu, et al.
Published: (2025)
Can Reasons Help Improve Pedestrian Intent Estimation? A Cross-Modal Approach
by: Khindkar, Vaishnavi, et al.
Published: (2024)
Resilient Multimodal Industrial Surface Defect Detection with Uncertain Sensors Availability
by: Jiang, Shuai, et al.
Published: (2025)
VMID: A Multimodal Fusion LLM Framework for Detecting and Identifying Misinformation of Short Videos
by: Zhong, Weihao, et al.
Published: (2024)
V-Loop: Visual Logical Loop Verification for Hallucination Detection in Medical Visual Question Answering
by: Jin, Mengyuan, et al.
Published: (2026)
CAD: A General Multimodal Framework for Video Deepfake Detection via Cross-Modal Alignment and Distillation
by: Du, Yuxuan, et al.
Published: (2025)
GatedCLIP: Gated Multimodal Fusion for Hateful Memes Detection
by: Guo, Yingying, et al.
Published: (2026)
Contour-Native Bridge Defect Detection and Compact Digital Archiving with Frequency-Supervised Fourier Contours
by: Liu, Jin, et al.
Published: (2026)
InfiFusion: A Unified Framework for Enhanced Cross-Model Reasoning via LLM Fusion
by: Yan, Zhaoyi, et al.
Published: (2025)
Pyramidal Adaptive Cross-Gating for Multimodal Detection
by: Gu, Zidong, et al.
Published: (2025)
Feature Perturbation Pool-based Fusion Network for Unified Multi-Class Industrial Defect Detection
by: Xu, Yuanchan, et al.
Published: (2026)
RCTDistill: Cross-Modal Knowledge Distillation Framework for Radar-Camera 3D Object Detection with Temporal Fusion
by: Bang, Geonho, et al.
Published: (2025)
UniPCB: A Generation-Assisted Detection Framework for PCB Defect Inspection
by: Zhang, Huan, et al.
Published: (2026)
Medical Report Generation: A Hierarchical Task Structure-Based Cross-Modal Causal Intervention Framework
by: Song, Yucheng, et al.
Published: (2025)
CSFMamba: Cross State Fusion Mamba Operator for Multimodal Remote Sensing Image Classification
by: Wang, Qingyu, et al.
Published: (2025)
AMFD: Distillation via Adaptive Multimodal Fusion for Multispectral Pedestrian Detection
by: Chen, Zizhao, et al.
Published: (2024)
Graph-Based Uncertainty Modeling and Multimodal Fusion for Salient Object Detection
by: Xiong, Yuqi, et al.
Published: (2025)
Uncertainty-Weighted Image-Event Multimodal Fusion for Video Anomaly Detection
by: Jeong, Sungheon, et al.
Published: (2025)