| Main Authors: | Alwazzan, Omnia, Patras, Ioannis, Slabaugh, Gregory |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2403.06339 |
Similar Items
MOAB: Multi-Modal Outer Arithmetic Block For Fusion Of Histopathological Images And Genetic Data For Brain Tumor Grading
by: Alwazzan, Omnia, et al.
Published: (2024)
Multimodal Outer Arithmetic Block Dual Fusion of Whole Slide Images and Omics Data for Precision Oncology
by: Alwazzan, Omnia, et al.
Published: (2024)
FashionSD-X: Multimodal Fashion Garment Synthesis using Latent Diffusion
by: Singh, Abhishek Kumar, et al.
Published: (2024)
BioX-CPath: Biologically-driven Explainable Diagnostics for Multistain IHC Computational Pathology
by: Gallagher-Syed, Amaya, et al.
Published: (2025)
Self-Supervised Facial Representation Learning with Facial Region Awareness
by: Gao, Zheng, et al.
Published: (2024)
Prompting Visual-Language Models for Dynamic Facial Expression Recognition
by: Zhao, Zengqun, et al.
Published: (2023)
Efficient Unsupervised Visual Representation Learning with Explicit Cluster Balancing
by: Metaxas, Ioannis Maniadis, et al.
Published: (2024)
CAMS: Convolution and Attention-Free Mamba-based Cardiac Image Segmentation
by: Khan, Abbas, et al.
Published: (2024)
CLIPCleaner: Cleaning Noisy Labels with CLIP
by: Feng, Chen, et al.
Published: (2024)
EmoCLIP: A Vision-Language Method for Zero-Shot Video Facial Expression Recognition
by: Foteinopoulou, Niki Maria, et al.
Published: (2023)
FairCoT: Enhancing Fairness in Text-to-Image Generation via Chain of Thought Reasoning with Multimodal Large Language Models
by: Sahili, Zahraa Al, et al.
Published: (2024)
CoLoRSMamba: Conditional LoRA-Steered Mamba for Supervised Multimodal Violence Detection
by: Senadeera, Damith Chamalke, et al.
Published: (2026)
SSR: An Efficient and Robust Framework for Learning with Unknown Label Noise
by: Feng, Chen, et al.
Published: (2021)
RAVE: Residual Vector Embedding for CLIP-Guided Backlit Image Enhancement
by: Gaintseva, Tatiana, et al.
Published: (2024)
CemiFace: Center-based Semi-hard Synthetic Face Generation for Face Recognition
by: Sun, Zhonglin, et al.
Published: (2024)
Are CLIP features all you need for Universal Synthetic Image Origin Attribution?
by: Cioni, Dario, et al.
Published: (2024)
Enhancing Zero-Shot Facial Expression Recognition by LLM Knowledge Transfer
by: Zhao, Zengqun, et al.
Published: (2024)
MM2Latent: Text-to-facial image generation and editing in GANs with multimodal assistance
by: Meng, Debin, et al.
Published: (2024)
Behaviour4All: in-the-wild Facial Behaviour Analysis Toolkit
by: Kollias, Dimitrios, et al.
Published: (2024)
Temporal Score Analysis for Understanding and Correcting Diffusion Artifacts
by: Cao, Yu, et al.
Published: (2025)
SuperCap: Multi-resolution Superpixel-based Image Captioning
by: Senior, Henry, et al.
Published: (2025)
Aligned Unsupervised Pretraining of Object Detectors with Self-training
by: Metaxas, Ioannis Maniadis, et al.
Published: (2023)
FairJudge: Abstention-Aware Multimodal Judges for Fairness and Alignment Evaluation in Text-to-Image Models
by: Sahili, Zahraa Al, et al.
Published: (2025)
VidCtx: Context-aware Video Question Answering with Image Models
by: Goulas, Andreas, et al.
Published: (2024)
P-TAME: Explain Any Image Classifier with Trained Perturbations
by: Ntrougkas, Mariano V., et al.
Published: (2025)
CUE-Net: Violence Detection Video Analytics with Spatial Cropping, Enhanced UniformerV2 and Modified Efficient Additive Attention
by: Senadeera, Damith Chamalke, et al.
Published: (2024)
A low complexity contextual stacked ensemble-learning approach for pedestrian intent prediction
by: Chiang, Chia-Yen, et al.
Published: (2024)
Flatten: Video Action Recognition is an Image Classification task
by: Chen, Junlin, et al.
Published: (2024)
UAM: A Unified Attention-Mamba Backbone of Multimodal Framework for Tumor Cell Classification
by: Chen, Taixi, et al.
Published: (2025)
One-shot Neural Face Reenactment via Finding Directions in GAN's Latent Space
by: Bounareli, Stella, et al.
Published: (2024)
LAFS: Landmark-based Facial Self-supervised Learning for Face Recognition
by: Sun, Zhonglin, et al.
Published: (2024)
XFMamba: Cross-Fusion Mamba for Multi-View Medical Image Classification
by: Zheng, Xiaoyu, et al.
Published: (2025)
RoGUENeRF: A Robust Geometry-Consistent Universal Enhancer for NeRF
by: Catley-Chandar, Sibi, et al.
Published: (2024)
STaR: Seamless Spatial-Temporal Aware Motion Retargeting with Penetration and Consistency Constraints
by: Yang, Xiaohang, et al.
Published: (2025)
FlattenGPT: Depth Compression for Transformer with Layer Flattening
by: Xu, Ruihan, et al.
Published: (2026)
Graph Neural Networks in Vision-Language Image Understanding: A Survey
by: Senior, Henry, et al.
Published: (2023)
Rethinking the Zigzag Flattening for Image Reading
by: Zhao, Qingsong, et al.
Published: (2022)
Relax Forcing: Relaxed KV-Memory for Consistent Long Video Generation
by: Zhao, Zengqun, et al.
Published: (2026)
CycleCap: Improving VLMs Captioning Performance via Self-Supervised Cycle Consistency Fine-Tuning
by: Krestenitis, Marios, et al.
Published: (2026)
Adaptive Multi-Modal Control of Digital Human Hand Synthesis Using a Region-Aware Cycle Loss
by: Fu, Qifan, et al.
Published: (2024)