| Main Authors: | Si, Jongwook, Kim, Sungyoung |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2509.00827 |
Similar Items
Satellite Image Utilization for Dehazing with Swin Transformer-Hybrid U-Net and Watershed loss
by: Si, Jongwook, et al.
Published: (2025)
Single Image Rain Streak Removal Using Harris Corner Loss and R-CBAM Network
by: Si, Jongwook, et al.
Published: (2025)
U-REPA: Aligning Diffusion U-Nets to ViTs
by: Tian, Yuchuan, et al.
Published: (2025)
Hybrid CNN-ViT Framework for Motion-Blurred Scene Text Restoration
by: Rashid, Umar, et al.
Published: (2025)
Exploring Plain ViT Reconstruction for Multi-class Unsupervised Anomaly Detection
by: Zhang, Jiangning, et al.
Published: (2023)
DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs
by: Kim, Donghyun, et al.
Published: (2024)
Mobile U-ViT: Revisiting large kernel and U-shaped ViT for efficient medical image segmentation
by: Tang, Fenghe, et al.
Published: (2025)
Deeper Inside Deep ViT
by: Hong, Sungrae
Published: (2025)
I&S-ViT: An Inclusive & Stable Method for Pushing the Limit of Post-Training ViTs Quantization
by: Zhong, Yunshan, et al.
Published: (2023)
CNN and ViT Efficiency Study on Tiny ImageNet and DermaMNIST Datasets
by: Amangeldi, Aidar, et al.
Published: (2025)
APHQ-ViT: Post-Training Quantization with Average Perturbation Hessian Based Reconstruction for Vision Transformers
by: Wu, Zhuguanyu, et al.
Published: (2025)
How to train your ViT for OOD Detection
by: Mueller, Maximilian, et al.
Published: (2024)
Neural Gabor Splatting: Enhanced Gaussian Splatting with Neural Gabor for High-frequency Surface Reconstruction
by: Watanabe, Haato, et al.
Published: (2026)
RepViT: Revisiting Mobile CNN From ViT Perspective
by: Wang, Ao, et al.
Published: (2023)
Frequency-Adaptive Discrete Cosine-ViT-ResNet Architecture for Sparse-Data Vision
by: Kang, Ziyue, et al.
Published: (2025)
AFIDAF: Alternating Fourier and Image Domain Adaptive Filters as an Efficient Alternative to Attention in ViTs
by: Zheng, Yunling, et al.
Published: (2024)
Filtered-ViT: A Robust Defense Against Multiple Adversarial Patch Attacks
by: Khanal, Aja, et al.
Published: (2025)
STRAP-ViT: Segregated Tokens with Randomized -- Transformations for Defense against Adversarial Patches in ViTs
by: Chattopadhyay, Nandish, et al.
Published: (2026)
Rethinking Random Masking in Self-Distillation on ViT
by: Seong, Jihyeon, et al.
Published: (2025)
Your ViT is Secretly an Image Segmentation Model
by: Kerssies, Tommie, et al.
Published: (2025)
YOLO-Former: YOLO Shakes Hand With ViT
by: Khoramdel, Javad, et al.
Published: (2024)
ViT-5: Vision Transformers for The Mid-2020s
by: Wang, Feng, et al.
Published: (2026)
ViTCAE: ViT-based Class-conditioned Autoencoder
by: Jebraeeli, Vahid, et al.
Published: (2025)
MMeViT: Multi-Modal ensemble ViT for Post-Stroke Rehabilitation Action Recognition
by: Kim, Ye-eun, et al.
Published: (2025)
DiffPoint: Single and Multi-view Point Cloud Reconstruction with ViT Based Diffusion Model
by: Feng, Yu, et al.
Published: (2024)
LogitDynamics: Reliable ViT Error Detection from Layerwise Logit Trajectories
by: Beigelman, Ido, et al.
Published: (2026)
ViT$^3$: Unlocking Test-Time Training in Vision
by: Han, Dongchen, et al.
Published: (2025)
EA-ViT: Efficient Adaptation for Elastic Vision Transformer
by: Zhu, Chen, et al.
Published: (2025)
Vanilla ViT for Automotive Point Cloud Semantic Segmentation
by: Puy, Gilles, et al.
Published: (2026)
Applying ViT in Generalized Few-shot Semantic Segmentation
by: Geng, Liyuan, et al.
Published: (2024)
One-Shot Multilingual Font Generation Via ViT
by: Wang, Zhiheng, et al.
Published: (2024)
ACC-ViT : Atrous Convolution's Comeback in Vision Transformers
by: Ibtehaz, Nabil, et al.
Published: (2024)
Purrturbed but Stable: Human-Cat Invariant Representations Across CNNs, ViTs and Self-Supervised ViTs
by: Shah, Arya, et al.
Published: (2025)
FilterViT and DropoutViT
by: Sun, Bohang
Published: (2024)
Modulating CNN Features with Pre-Trained ViT Representations for Open-Vocabulary Object Detection
by: Gao, Xiangyu, et al.
Published: (2025)
ViT-DD: Multi-Task Vision Transformer for Semi-Supervised Driver Distraction Detection
by: Ma, Yunsheng, et al.
Published: (2022)
OSR-ViT: A Simple and Modular Framework for Open-Set Object Detection and Discovery
by: Inkawhich, Matthew, et al.
Published: (2024)
A Hybrid CNN-ViT-GNN Framework with GAN-Based Augmentation for Intelligent Weed Detection in Precision Agriculture
by: V, Pandiyaraju, et al.
Published: (2025)
FViT: A Focal Vision Transformer with Gabor Filter
by: Shi, Yulong, et al.
Published: (2024)
Knowledge Distillation in YOLOX-ViT for Side-Scan Sonar Object Detection
by: Aubard, Martin, et al.
Published: (2024)