| Main Authors: | Faghihi, Ehsan, Zarenejad, Mohammedreza, Shirazi, Ali-Asghar Beheshti |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2411.01975 |
Similar Items
RetinaLogos: Fine-Grained Synthesis of High-Resolution Retinal Images Through Captions
by: Ning, Junzhi, et al.
Published: (2025)
Listening without Looking: Modality Bias in Audio-Visual Captioning
by: Ishikawa, Yuchi, et al.
Published: (2025)
EMOVIS: Emotion-Optimized Image Processing
by: Barber, Dor, et al.
Published: (2026)
AI-Enhanced Virtual Biopsies for Brain Tumor Diagnosis in Low Resource Settings
by: Ehsan, Areeb
Published: (2025)
Whitened CLIP as a Likelihood Surrogate of Images and Captions
by: Betser, Roy, et al.
Published: (2025)
The Solution for the CVPR2023 NICE Image Captioning Challenge
by: Wu, Xiangyu, et al.
Published: (2023)
When Eye-Tracking Meets Machine Learning: A Systematic Review on Applications in Medical Image Analysis
by: Moradizeyveh, Sahar, et al.
Published: (2024)
Stacked Cross-modal Feature Consolidation Attention Networks for Image Captioning
by: Pourkeshavarz, Mozhgan, et al.
Published: (2023)
SurgTPGS: Semantic 3D Surgical Scene Understanding with Text Promptable Gaussian Splatting
by: Huang, Yiming, et al.
Published: (2025)
Omnidirectional Image Quality Captioning: A Large-scale Database and A New Model
by: Yan, Jiebin, et al.
Published: (2025)
E-RGB-D: Real-Time Event-Based Perception with Structured Light
by: Bajestani, Seyed Ehsan Marjani, et al.
Published: (2025)
Unified Multi-Modal Image Synthesis for Missing Modality Imputation
by: Zhang, Yue, et al.
Published: (2023)
Efficient motion-based metrics for video frame interpolation
by: Daly, Conall, et al.
Published: (2025)
A Survey on Semantic Communication for Vision: Categories, Frameworks, Enabling Techniques, and Applications
by: Cheng, Runze, et al.
Published: (2026)
Detection and tracking of gas plumes in LWIR hyperspectral video sequence data
by: Gerhart, Torin, et al.
Published: (2024)
Automated extraction of 4D aircraft trajectories from video recordings
by: Villeforceix, Jean-François
Published: (2024)
Understanding-informed Bias Mitigation for Fair CMR Segmentation
by: Lee, Tiarna, et al.
Published: (2025)
3R-INN: How to be climate friendly while consuming/delivering videos?
by: Ameur, Zoubida, et al.
Published: (2024)
Ti-Patch: Tiled Physical Adversarial Patch for no-reference video quality metrics
by: Leonenkova, Victoria, et al.
Published: (2024)
Physics-Informed Latent Diffusion for Multimodal Brain MRI Synthesis
by: Lüpke, Sven, et al.
Published: (2024)
Multi-scale and Multi-path Cascaded Convolutional Network for Semantic Segmentation of Colorectal Polyps
by: Manan, Malik Abdul, et al.
Published: (2024)
Predicting total time to compress a video corpus using online inference systems
by: Shu, Xin, et al.
Published: (2024)
C3VDv2 -- Colonoscopy 3D video dataset with enhanced realism
by: Golhar, Mayank V., et al.
Published: (2025)
Multi-modality transrectal ultrasound video classification for identification of clinically significant prostate cancer
by: Wu, Hong, et al.
Published: (2024)
BronchoGAN: Anatomically consistent and domain-agnostic image-to-image translation for video bronchoscopy
by: Soliman, Ahmad, et al.
Published: (2025)
Robust Divergence Learning for Missing-Modality Segmentation
by: Cheng, Runze, et al.
Published: (2024)
MulModSeg: Enhancing Unpaired Multi-Modal Medical Image Segmentation with Modality-Conditioned Text Embedding and Alternating Training
by: Li, Chengyin, et al.
Published: (2024)
Dual-Stream Cross-Modal Representation Learning via Residual Semantic Decorrelation
by: Li, Xuecheng, et al.
Published: (2025)
CAMP-VQA: Caption-Embedded Multimodal Perception for No-Reference Quality Assessment of Compressed Video
by: Wang, Xinyi, et al.
Published: (2025)
MRI to PET Cross-Modality Translation using Globally and Locally Aware GAN (GLA-GAN) for Multi-Modal Diagnosis of Alzheimer's Disease
by: Sikka, Apoorva, et al.
Published: (2021)
Anomaly detection in non-stationary videos using time-recursive differencing network based prediction
by: Pillai, Gargi V., et al.
Published: (2025)
CORSTITCH - A free, open source software for stitching and georeferencing underwater coral reef videos
by: Maypa, Julian Christopher L., et al.
Published: (2025)
Optimally Bridging Semantics and Data: Generative Semantic Communication via Schrödinger Bridge
by: Gao, Dahua, et al.
Published: (2026)
Vision Transformer Based Semantic Communications for Next Generation Wireless Networks
by: Mohsin, Muhammad Ahmed, et al.
Published: (2025)
Modality Exchange Network for Retinogeniculate Visual Pathway Segmentation
by: Han, Hua, et al.
Published: (2024)
Universal Vessel Segmentation for Multi-Modality Retinal Images
by: Wen, Bo, et al.
Published: (2025)
Classification of All Blood Cell Images using ML and DL Models
by: Asghar, Rabia, et al.
Published: (2023)
Automatic Classification of White Blood Cell Images using Convolutional Neural Network
by: Asghar, Rabia, et al.
Published: (2024)
Higher fidelity perceptual image and video compression with a latent conditioned residual denoising diffusion model
by: Brenig, Jonas, et al.
Published: (2025)
Physical prior guided cooperative learning framework for joint turbulence degradation estimation and infrared video restoration
by: Zhang, Ziran, et al.
Published: (2024)