| Main Authors: | Saha, Shaibal, Li, Fan, Li, Yunge, Iyengar, Arun, Alves, Lucas, Xu, Lanyu |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2604.19834 |
Similar Items
EfficientQuant: An Efficient Post-Training Quantization for CNN-Transformer Hybrid Models on Edge Devices
by: Saha, Shaibal, et al.
Published: (2025)
Vision Transformers on the Edge: A Comprehensive Survey of Model Compression and Acceleration Strategies
by: Saha, Shaibal, et al.
Published: (2025)
Neighbor-Aware Token Reduction via Hilbert Curve for Vision Transformers
by: Li, Yunge, et al.
Published: (2025)
MTMed3D: A Multi-Task Transformer-Based Model for 3D Medical Imaging
by: Li, Fan, et al.
Published: (2025)
Hilbert-Guided Sparse Local Attention
by: Li, Yunge, et al.
Published: (2025)
Panoptic Perception for Autonomous Driving: A Survey
by: Li, Yunge, et al.
Published: (2024)
LLM-as-Judge Framework for Evaluating Tone-Induced Hallucination in Vision-Language Models
by: Jiang, Zhiyuan, et al.
Published: (2026)
KD-CVG: A Knowledge-Driven Approach for Creative Video Generation
by: Liu, Linkai, et al.
Published: (2026)
Judge Anything: MLLM as a Judge Across Any Modality
by: Pu, Shu, et al.
Published: (2025)
Omni-Judge: Can Omni-LLMs Serve as Human-Aligned Judges for Text-Conditioned Audio-Video Generation?
by: Liang, Susan, et al.
Published: (2026)
CustomKD: Customizing Large Vision Foundation for Edge Model Improvement via Knowledge Distillation
by: Lee, Jungsoo, et al.
Published: (2025)
Self-Improving VLM Judges Without Human Annotations
by: Lin, Inna Wanyin, et al.
Published: (2025)
ProJudge: A Multi-Modal Multi-Discipline Benchmark and Instruction-Tuning Dataset for MLLM-based Process Judges
by: Ai, Jiaxin, et al.
Published: (2025)
CrossKD: Cross-Head Knowledge Distillation for Object Detection
by: Wang, Jiabao, et al.
Published: (2023)
DiffKD-DCIS: Predicting Upgrade of Ductal Carcinoma In Situ with Diffusion Augmentation and Knowledge Distillation
by: Li, Tao, et al.
Published: (2026)
VideoJudge: Bootstrapping Enables Scalable Supervision of MLLM-as-a-Judge for Video Understanding
by: Waheed, Abdul, et al.
Published: (2025)
ClinKD: Cross-Modal Clinical Knowledge Distiller For Multi-Task Medical Images
by: Ge, Hongyu, et al.
Published: (2025)
Agri-CPJ: A Training-Free Explainable Framework for Agricultural Pest Diagnosis Using Caption-Prompt-Judge and LLM-as-a-Judge
by: Zhang, Wentao, et al.
Published: (2026)
TopKD: Top-scaled Knowledge Distillation
by: Wang, Qi, et al.
Published: (2025)
CPJ: Explainable Agricultural Pest Diagnosis via Caption-Prompt-Judge with LLM-Judged Refinement
by: Zhang, Wentao, et al.
Published: (2025)
Judging the Judges: Can Large Vision-Language Models Fairly Evaluate Chart Comprehension and Reasoning?
by: Laskar, Md Tahmid Rahman, et al.
Published: (2025)
FairJudge: Abstention-Aware Multimodal Judges for Fairness and Alignment Evaluation in Text-to-Image Models
by: Sahili, Zahraa Al, et al.
Published: (2025)
Multi-Crit: Benchmarking Multimodal Judges on Pluralistic Criteria-Following
by: Xiong, Tianyi, et al.
Published: (2025)
CopyJudge: Automated Copyright Infringement Identification and Mitigation in Text-to-Image Diffusion Models
by: Liu, Shunchang, et al.
Published: (2025)
Judge, Then Drive: A Critic-Centric Vision Language Action Framework for Autonomous Driving
by: Yang, Lijin, et al.
Published: (2026)
A Visually Impaired Assistance Benchmark for VLM-as-a-Judge Evaluation
by: Zhao, Yi, et al.
Published: (2026)
MapKD: Unlocking Prior Knowledge with Cross-Modal Distillation for Efficient Online HD Map Construction
by: Yan, Ziyang, et al.
Published: (2025)
MoKD: Multi-Task Optimization for Knowledge Distillation
by: Hayder, Zeeshan, et al.
Published: (2025)
EA-KD: Entropy-based Adaptive Knowledge Distillation
by: Su, Chi-Ping, et al.
Published: (2023)
BD-KD: Balancing the Divergences for Online Knowledge Distillation
by: Amara, Ibtihel, et al.
Published: (2022)
KD-DETR: Knowledge Distillation for Detection Transformer with Consistent Distillation Points Sampling
by: Wang, Yu, et al.
Published: (2022)
ProBench: Judging Multimodal Foundation Models on Open-ended Multi-domain Expert Tasks
by: Yang, Yan, et al.
Published: (2025)
MLLM-as-a-Judge Exhibits Model Preference Bias
by: Koyama, Shuitsu, et al.
Published: (2026)
VLIC: Vision-Language Models As Perceptual Judges for Human-Aligned Image Compression
by: Sargent, Kyle, et al.
Published: (2025)
ComKD-CLIP: Comprehensive Knowledge Distillation for Contrastive Language-Image Pre-traning Model
by: Chen, Yifan, et al.
Published: (2024)
MM-JudgeBias: A Benchmark for Evaluating Compositional Biases in MLLM-as-a-Judge
by: Lee, Sua, et al.
Published: (2026)
MLLM-as-a-Judge: Assessing Multimodal LLM-as-a-Judge with Vision-Language Benchmark
by: Chen, Dongping, et al.
Published: (2024)
OpenING: A Comprehensive Benchmark for Judging Open-ended Interleaved Image-Text Generation
by: Zhou, Pengfei, et al.
Published: (2024)
ACAM-KD: Adaptive and Cooperative Attention Masking for Knowledge Distillation
by: Lan, Qizhen, et al.
Published: (2025)
FreeKD: Knowledge Distillation via Semantic Frequency Prompt
by: Zhang, Yuan, et al.
Published: (2023)