:: Library Catalog

Bibliographic Details
Main Authors:	Faghihi, Ehsan; Zarenejad, Mohammedreza; Shirazi, Ali-Asghar Beheshti
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition; Image and Video Processing
Online Access:	https://arxiv.org/abs/2411.01975

Similar Items

RetinaLogos: Fine-Grained Synthesis of High-Resolution Retinal Images Through Captions
by: Ning, Junzhi, et al.
Published: (2025)

Listening without Looking: Modality Bias in Audio-Visual Captioning
by: Ishikawa, Yuchi, et al.
Published: (2025)

EMOVIS: Emotion-Optimized Image Processing
by: Barber, Dor, et al.
Published: (2026)

AI-Enhanced Virtual Biopsies for Brain Tumor Diagnosis in Low Resource Settings
by: Ehsan, Areeb
Published: (2025)

Whitened CLIP as a Likelihood Surrogate of Images and Captions
by: Betser, Roy, et al.
Published: (2025)

The Solution for the CVPR2023 NICE Image Captioning Challenge
by: Wu, Xiangyu, et al.
Published: (2023)

When Eye-Tracking Meets Machine Learning: A Systematic Review on Applications in Medical Image Analysis
by: Moradizeyveh, Sahar, et al.
Published: (2024)

Stacked Cross-modal Feature Consolidation Attention Networks for Image Captioning
by: Pourkeshavarz, Mozhgan, et al.
Published: (2023)

SurgTPGS: Semantic 3D Surgical Scene Understanding with Text Promptable Gaussian Splatting
by: Huang, Yiming, et al.
Published: (2025)

Omnidirectional Image Quality Captioning: A Large-scale Database and A New Model
by: Yan, Jiebin, et al.
Published: (2025)

E-RGB-D: Real-Time Event-Based Perception with Structured Light
by: Bajestani, Seyed Ehsan Marjani, et al.
Published: (2025)

Unified Multi-Modal Image Synthesis for Missing Modality Imputation
by: Zhang, Yue, et al.
Published: (2023)

Efficient motion-based metrics for video frame interpolation
by: Daly, Conall, et al.
Published: (2025)

A Survey on Semantic Communication for Vision: Categories, Frameworks, Enabling Techniques, and Applications
by: Cheng, Runze, et al.
Published: (2026)

Detection and tracking of gas plumes in LWIR hyperspectral video sequence data
by: Gerhart, Torin, et al.
Published: (2024)

Automated extraction of 4D aircraft trajectories from video recordings
by: Villeforceix, Jean-François
Published: (2024)

Understanding-informed Bias Mitigation for Fair CMR Segmentation
by: Lee, Tiarna, et al.
Published: (2025)

3R-INN: How to be climate friendly while consuming/delivering videos?
by: Ameur, Zoubida, et al.
Published: (2024)

Ti-Patch: Tiled Physical Adversarial Patch for no-reference video quality metrics
by: Leonenkova, Victoria, et al.
Published: (2024)

Physics-Informed Latent Diffusion for Multimodal Brain MRI Synthesis
by: Lüpke, Sven, et al.
Published: (2024)

Multi-scale and Multi-path Cascaded Convolutional Network for Semantic Segmentation of Colorectal Polyps
by: Manan, Malik Abdul, et al.
Published: (2024)

Predicting total time to compress a video corpus using online inference systems
by: Shu, Xin, et al.
Published: (2024)

C3VDv2 -- Colonoscopy 3D video dataset with enhanced realism
by: Golhar, Mayank V., et al.
Published: (2025)

Multi-modality transrectal ultrasound video classification for identification of clinically significant prostate cancer
by: Wu, Hong, et al.
Published: (2024)

BronchoGAN: Anatomically consistent and domain-agnostic image-to-image translation for video bronchoscopy
by: Soliman, Ahmad, et al.
Published: (2025)

Robust Divergence Learning for Missing-Modality Segmentation
by: Cheng, Runze, et al.
Published: (2024)

MulModSeg: Enhancing Unpaired Multi-Modal Medical Image Segmentation with Modality-Conditioned Text Embedding and Alternating Training
by: Li, Chengyin, et al.
Published: (2024)

Dual-Stream Cross-Modal Representation Learning via Residual Semantic Decorrelation
by: Li, Xuecheng, et al.
Published: (2025)

CAMP-VQA: Caption-Embedded Multimodal Perception for No-Reference Quality Assessment of Compressed Video
by: Wang, Xinyi, et al.
Published: (2025)

MRI to PET Cross-Modality Translation using Globally and Locally Aware GAN (GLA-GAN) for Multi-Modal Diagnosis of Alzheimer's Disease
by: Sikka, Apoorva, et al.
Published: (2021)

Anomaly detection in non-stationary videos using time-recursive differencing network based prediction
by: Pillai, Gargi V., et al.
Published: (2025)

CORSTITCH - A free, open source software for stitching and georeferencing underwater coral reef videos
by: Maypa, Julian Christopher L., et al.
Published: (2025)

Optimally Bridging Semantics and Data: Generative Semantic Communication via Schrödinger Bridge
by: Gao, Dahua, et al.
Published: (2026)

Vision Transformer Based Semantic Communications for Next Generation Wireless Networks
by: Mohsin, Muhammad Ahmed, et al.
Published: (2025)

Modality Exchange Network for Retinogeniculate Visual Pathway Segmentation
by: Han, Hua, et al.
Published: (2024)

Universal Vessel Segmentation for Multi-Modality Retinal Images
by: Wen, Bo, et al.
Published: (2025)

Classification of All Blood Cell Images using ML and DL Models
by: Asghar, Rabia, et al.
Published: (2023)

Automatic Classification of White Blood Cell Images using Convolutional Neural Network
by: Asghar, Rabia, et al.
Published: (2024)

Higher fidelity perceptual image and video compression with a latent conditioned residual denoising diffusion model
by: Brenig, Jonas, et al.
Published: (2025)

Physical prior guided cooperative learning framework for joint turbulence degradation estimation and infrared video restoration
by: Zhang, Ziran, et al.
Published: (2024)