| Main Authors: | Gani, Hanan, Saadi, Nada, Hussein, Noor, Nandakumar, Karthik |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2402.08070 |
Similar Items
Test-Time Low Rank Adaptation via Confidence Maximization for Zero-Shot Generalization of Vision-Language Models
by: Imam, Raza, et al.
Published: (2024)
PEMMA: Parameter-Efficient Multi-Modal Adaptation for Medical Image Segmentation
by: Saadi, Nada, et al.
Published: (2024)
PromptSmooth: Certifying Robustness of Medical Vision-Language Models via Prompt Learning
by: Hussein, Noor, et al.
Published: (2024)
Efficient Parameter Adaptation for Multi-Modal Medical Image Segmentation and Prognosis
by: Saeed, Numan, et al.
Published: (2025)
Intra-finger Variability of Diffusion-based Latent Fingerprint Generation
by: Hussein, Noor, et al.
Published: (2026)
MOLM: Mixture of LoRA Markers
by: Fares, Samar, et al.
Published: (2025)
First-Place Solution to NeurIPS 2024 Invisible Watermark Removal Challenge
by: Shamshad, Fahad, et al.
Published: (2025)
Align Your Prompts: Test-Time Prompting with Distribution Alignment for Zero-Shot Generalization
by: Hassan, Jameel, et al.
Published: (2023)
Probing the Efficacy of Federated Parameter-Efficient Fine-Tuning of Vision Transformers for Medical Image Classification
by: Alkhunaizi, Naif, et al.
Published: (2024)
SPQR: A Standardized Benchmark for Modern Safety Alignment Methods in Text-to-Image Diffusion Models
by: Alam, Mohammed Talha, et al.
Published: (2025)
MedContext: Learning Contextual Cues for Efficient Volumetric Medical Segmentation
by: Gani, Hanan, et al.
Published: (2024)
Robust-LLaVA: On the Effectiveness of Large-Scale Robust Image Encoders for Multi-modal Large Language Models
by: Malik, Hashmat Shadab, et al.
Published: (2025)
Shuffle Vision Transformer: Lightweight, Fast and Efficient Recognition of Driver Facial Expression
by: Saadi, Ibtissam, et al.
Published: (2024)
SPDMark: Selective Parameter Displacement for Robust Video Watermarking
by: Fares, Samar, et al.
Published: (2025)
VANE-Bench: Video Anomaly Evaluation Benchmark for Conversational LMMs
by: Bharadwaj, Rohit, et al.
Published: (2024)
LLM Blueprint: Enabling Text-to-Image Generation with Complex and Detailed Prompts
by: Gani, Hanan, et al.
Published: (2023)
Self-Supervised Vision Transformers Are Efficient Segmentation Learners for Imperfect Labels
by: Lee, Seungho, et al.
Published: (2024)
Calibration-Aware Prompt Learning for Medical Vision-Language Models
by: Basu, Abhishek, et al.
Published: (2025)
Vision Transformers are Circulant Attention Learners
by: Han, Dongchen, et al.
Published: (2025)
SimLVSeg: Simplifying Left Ventricular Segmentation in 2D+Time Echocardiograms with Self- and Weakly-Supervised Learning
by: Maani, Fadillah, et al.
Published: (2023)
RAVEN: Erasing Invisible Watermarks via Novel View Synthesis
by: Shamshad, Fahad, et al.
Published: (2026)
STEREO: A Two-Stage Framework for Adversarially Robust Concept Erasing from Text-to-Image Diffusion Models
by: Srivatsan, Koushik, et al.
Published: (2024)
Towards Evaluating the Robustness of Visual State Space Models
by: Malik, Hashmat Shadab, et al.
Published: (2024)
Makeup-Guided Facial Privacy Protection via Untrained Neural Network Priors
by: Shamshad, Fahad, et al.
Published: (2024)
RWKV-CLIP: A Robust Vision-Language Representation Learner
by: Gu, Tiancheng, et al.
Published: (2024)
VideoGLaMM: A Large Multimodal Model for Pixel-Level Visual Grounding in Videos
by: Munasinghe, Shehan, et al.
Published: (2024)
AgriCLIP: Adapting CLIP for Agriculture and Livestock via Domain-Specialized Cross-Model Alignment
by: Nawaz, Umair, et al.
Published: (2024)
MirrorCheck: Efficient Adversarial Defense for Vision-Language Models
by: Fares, Samar, et al.
Published: (2024)
Multi-Tailed Vision Transformer for Efficient Inference
by: Wang, Yunke, et al.
Published: (2022)
DiffuseMix: Label-Preserving Data Augmentation with Diffusion Models
by: Islam, Khawar, et al.
Published: (2024)
FaceAnonyMixer: Cancelable Faces via Identity Consistent Latent Space Mixing
by: Alam, Mohammed Talha, et al.
Published: (2025)
Pre-trained Vision and Language Transformers Are Few-Shot Incremental Learners
by: Park, Keon-Hee, et al.
Published: (2024)
Noise is an Efficient Learner for Zero-Shot Vision-Language Models
by: Imam, Raza, et al.
Published: (2025)
A Framework for Double-Blind Federated Adaptation of Foundation Models
by: Tastan, Nurbek, et al.
Published: (2025)
VideoMolmo: Spatio-Temporal Grounding Meets Pointing
by: Ahmad, Ghazi Shazan, et al.
Published: (2025)
PE-CLIP: A Parameter-Efficient Fine-Tuning of Vision Language Models for Dynamic Facial Expression Recognition
by: Saadi, Ibtissam, et al.
Published: (2025)
Transforming Vision Transformer: Towards Efficient Multi-Task Asynchronous Learning
by: Zhong, Hanwen, et al.
Published: (2025)
FullLoRA: Efficiently Boosting the Robustness of Pretrained Vision Transformers
by: Yuan, Zheng, et al.
Published: (2024)
Continual Few-shot Adaptation for Synthetic Fingerprint Detection
by: Benjamin, Joseph Geo, et al.
Published: (2026)
Multi-modal Attribute Prompting for Vision-Language Models
by: Liu, Xin, et al.
Published: (2024)