| Main Authors: | Krishnan, Prashant, Wang, Zilong, Wang, Yangkun, Shang, Jingbo |
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2305.14828 |
Similar Items
Multi-Level Correlation Network For Few-Shot Image Classification
by: Dang, Yunkai, et al.
Published: (2024)
Modularized Networks for Few-shot Hateful Meme Detection
by: Cao, Rui, et al.
Published: (2024)
EAMA : Entity-Aware Multimodal Alignment Based Approach for News Image Captioning
by: Zhang, Junzhe, et al.
Published: (2024)
LLMs as Bridges: Reformulating Grounded Multimodal Named Entity Recognition
by: Li, Jinyuan, et al.
Published: (2024)
Joint Image-Instance Spatial-Temporal Attention for Few-shot Action Recognition
by: Qian, Zefeng, et al.
Published: (2025)
IFCap: Image-like Retrieval and Frequency-based Entity Filtering for Zero-shot Captioning
by: Lee, Soeun, et al.
Published: (2024)
Unleashing In-context Learning of Autoregressive Models for Few-shot Image Manipulation
by: Lai, Bolin, et al.
Published: (2024)
Harmfully Manipulated Images Matter in Multimodal Misinformation Detection
by: Wang, Bing, et al.
Published: (2024)
Grounding Language Models for Visual Entity Recognition
by: Xiao, Zilin, et al.
Published: (2024)
RobustEMD: Domain Robust Matching for Cross-domain Few-shot Medical Image Segmentation
by: Zhu, Yazhou, et al.
Published: (2024)
MOFI: Learning Image Representations from Noisy Entity Annotated Images
by: Wu, Wentao, et al.
Published: (2023)
Support-Query Prototype Fusion Network for Few-shot Medical Image Segmentation
by: Wu, Xiaoxiao, et al.
Published: (2024)
AMU-Tuning: Effective Logit Bias for CLIP-based Few-shot Learning
by: Tang, Yuwei, et al.
Published: (2024)
CEIDM: A Controlled Entity and Interaction Diffusion Model for Enhanced Text-to-Image Generation
by: Yang, Mingyue, et al.
Published: (2025)
TextTIGER: Text-based Intelligent Generation with Entity Prompt Refinement for Text-to-Image Generation
by: Ozaki, Shintaro, et al.
Published: (2025)
Task-Adapter: Task-specific Adaptation of Image Models for Few-shot Action Recognition
by: Cao, Congqi, et al.
Published: (2024)
Image Over Text: Transforming Formula Recognition Evaluation with Character Detection Matching
by: Wang, Bin, et al.
Published: (2024)
Reconstruction Guided Few-shot Network For Remote Sensing Image Classification
by: Jaiswal, Mohit, et al.
Published: (2026)
SPENet: Self-guided Prototype Enhancement Network for Few-shot Medical Image Segmentation
by: Fan, Chao, et al.
Published: (2025)
Analyzing Images of Legal Documents: Toward Multi-Modal LLMs for Access to Justice
by: Westermann, Hannes, et al.
Published: (2024)
Siamese Transformer Networks for Few-shot Image Classification
by: Jiang, Weihao, et al.
Published: (2024)
SceneAlign: Aligning Multimodal Reasoning to Scene Graphs in Complex Visual Scenes
by: Wang, Chuhan, et al.
Published: (2026)
Pose2Gest: A Few-Shot Model-Free Approach Applied In South Indian Classical Dance Gesture Recognition
by: Raju, Kavitha, et al.
Published: (2024)
Towards Automatic Evaluation for Image Transcreation
by: Khanuja, Simran, et al.
Published: (2024)
Multi-Grained Query-Guided Set Prediction Network for Grounded Multimodal Named Entity Recognition
by: Tang, Jielong, et al.
Published: (2024)
Benchmarking Graph Neural Networks for Document Layout Analysis in Public Affairs
by: Lopez-Duran, Miguel, et al.
Published: (2025)
E2E-GMNER: End-to-End Generative Grounded Multimodal Named Entity Recognition
by: Zhang, Meng, et al.
Published: (2026)
MoEController: Instruction-based Arbitrary Image Manipulation with Mixture-of-Expert Controllers
by: Li, Sijia, et al.
Published: (2023)
Efficient Few-Shot Medical Image Analysis via Hierarchical Contrastive Vision-Language Learning
by: Fuller, Harrison, et al.
Published: (2025)
Few-shot Adaptation to Distribution Shifts By Mixing Source and Target Embeddings
by: Xue, Yihao, et al.
Published: (2023)
UniFine: A Unified and Fine-grained Approach for Zero-shot Vision-Language Understanding
by: Wang, Zhecan, et al.
Published: (2023)
Robust Saliency-Aware Distillation for Few-shot Fine-grained Visual Recognition
by: Liu, Haiqi, et al.
Published: (2023)
Restoring Ancient Ideograph: A Multimodal Multitask Neural Network Approach
by: Duan, Siyu, et al.
Published: (2024)
Echo-4o: Harnessing the Power of GPT-4o Synthetic Images for Improved Image Generation
by: Ye, Junyan, et al.
Published: (2025)
Few-shot Image Generation via Masked Discrimination
by: Zhu, Jingyuan, et al.
Published: (2022)
Manifold Induced Biases for Zero-shot and Few-shot Detection of Generated Images
by: Brokman, Jonathan, et al.
Published: (2025)
Augmenting Prototype Network with TransMix for Few-shot Hyperspectral Image Classification
by: Liu, Chun, et al.
Published: (2024)
Exploiting DINOv3-Based Self-Supervised Features for Robust Few-Shot Medical Image Segmentation
by: Xu, Guoping, et al.
Published: (2026)
DiffChat: Learning to Chat with Text-to-Image Synthesis Models for Interactive Image Creation
by: Wang, Jiapeng, et al.
Published: (2024)
CLIP-guided Prototype Modulating for Few-shot Action Recognition
by: Wang, Xiang, et al.
Published: (2023)