| Main Authors: | Kwok, Wing Man Casca, Tung, Yip Chiu, Bhagchandani, Kunal |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2506.03607 |
Similar Items
CAPEEN: Image Captioning with Early Exits and Knowledge Distillation
by: Bajpai, Divya Jyoti, et al.
Published: (2024)
Edge-Efficient Image Restoration: Transformer Distillation into State-Space Models
by: Miriyala, Srinivas Soumitri, et al.
Published: (2026)
Compositional Oil Spill Detection Based on Object Detector and Adapted Segment Anything Model from SAR Images
by: Wu, Wenhui, et al.
Published: (2024)
An Edge AI System Based on FPGA Platform for Railway Fault Detection
by: Li, Jiale, et al.
Published: (2024)
A Novel Lightweight Transformer with Edge-Aware Fusion for Remote Sensing Image Captioning
by: Das, Swadhin, et al.
Published: (2025)
Efficient Knowledge Distillation of SAM for Medical Image Segmentation
by: Patil, Kunal Dasharath, et al.
Published: (2025)
A Transformer-in-Transformer Network Utilizing Knowledge Distillation for Image Recognition
by: Rahman, Dewan Tauhid, et al.
Published: (2025)
Token Compression Meets Compact Vision Transformers: A Survey and Comparative Evaluation for Edge AI
by: Nguyen, Phat, et al.
Published: (2025)
UIT-DarkCow team at ImageCLEFmedical Caption 2024: Diagnostic Captioning for Radiology Images Efficiency with Transformer Models
by: Van Nguyen, Quan, et al.
Published: (2024)
CapArena: Benchmarking and Analyzing Detailed Image Captioning in the LLM Era
by: Cheng, Kanzhi, et al.
Published: (2025)
Where Do Images Come From? Analyzing Captions to Geographically Profile Datasets
by: Basu, Abhipsa, et al.
Published: (2026)
Dual-Stream Collaborative Transformer for Image Captioning
by: Wan, Jun, et al.
Published: (2026)
Efficient Few-Shot Learning for Edge AI via Knowledge Distillation on MobileViT
by: Tsuyuki, Shuhei, et al.
Published: (2026)
Analyzing Image Beyond Visual Aspect: Image Emotion Classification via Multiple-Affective Captioning
by: Zhou, Zibo, et al.
Published: (2025)
Image Generation from Image Captioning -- Invertible Approach
by: Menon, Nandakishore S, et al.
Published: (2024)
Beam-Guided Knowledge Replay for Knowledge-Rich Image Captioning using Vision-Language Model
by: AlJunaid, Reem, et al.
Published: (2025)
Towards Optimal Trade-offs in Knowledge Distillation for CNNs and Vision Transformers at the Edge
by: Violos, John, et al.
Published: (2024)
Shifted Window Fourier Transform And Retention For Image Captioning
by: Hu, Jia Cheng, et al.
Published: (2024)
Automated Image Captioning with CNNs and Transformers
by: Cahyono, Joshua Adrian, et al.
Published: (2024)
EdgeGaussians -- 3D Edge Mapping via Gaussian Splatting
by: Chelani, Kunal, et al.
Published: (2024)
Knowledge Distillation via the Target-aware Transformer
by: Lin, Sihao, et al.
Published: (2022)
Context-aware Difference Distilling for Multi-change Captioning
by: Tu, Yunbin, et al.
Published: (2024)
m2mKD: Module-to-Module Knowledge Distillation for Modular Transformers
by: Lo, Ka Man, et al.
Published: (2024)
CaptionFool: Universal Image Captioning Model Attacks
by: Parekh, Swapnil
Published: (2026)
CustomKD: Customizing Large Vision Foundation for Edge Model Improvement via Knowledge Distillation
by: Lee, Jungsoo, et al.
Published: (2025)
CaptionQA: Is Your Caption as Useful as the Image Itself?
by: Yang, Shijia, et al.
Published: (2025)
Point Clouds Are Specialized Images: A Knowledge Transfer Approach for 3D Understanding
by: Kang, Jiachen, et al.
Published: (2023)
Transformer based Multitask Learning for Image Captioning and Object Detection
by: Basak, Debolena, et al.
Published: (2024)
KD-DETR: Knowledge Distillation for Detection Transformer with Consistent Distillation Points Sampling
by: Wang, Yu, et al.
Published: (2022)
HiSem: Hierarchical Semantic Disentangling for Remote Sensing Image Change Captioning
by: Wang, Man, et al.
Published: (2026)
Detecting and Understanding Hateful Contents in Memes Through Captioning and Visual Question-Answering
by: Anaissi, Ali, et al.
Published: (2025)
Embedded Heterogeneous Attention Transformer for Cross-lingual Image Captioning
by: Song, Zijie, et al.
Published: (2023)
Transformer Architecture for NetsDB
by: Kamble, Subodh, et al.
Published: (2024)
Knowledge Distillation in Vision Transformers: A Critical Review
by: Habib, Gousia, et al.
Published: (2023)
Knowledge Distillation via Query Selection for Detection Transformer
by: Liu, Yi, et al.
Published: (2024)
Improving Interpretable Embeddings for Ad-hoc Video Search with Generative Captions and Multi-word Concept Bank
by: Wu, Jiaxin, et al.
Published: (2024)
CaptionSmiths: Flexibly Controlling Language Pattern in Image Captioning
by: Saito, Kuniaki, et al.
Published: (2025)
Caption-Matching: A Multimodal Approach for Cross-Domain Image Retrieval
by: Iijima, Lucas, et al.
Published: (2024)
Adjust Your Focus: Defocus Deblurring From Dual-Pixel Images Using Explicit Multi-Scale Cross-Correlation
by: Swami, Kunal
Published: (2025)
A Lightweight Sparse Focus Transformer for Remote Sensing Image Change Captioning
by: Sun, Dongwei, et al.
Published: (2024)