Library Catalog

Bibliographic Details
Main Authors:	Soroka, Emi; Arzyn, Artem
Format:	Preprint
Published:	2025
Subjects:	Machine Learning; Computer Vision and Pattern Recognition; I.4
Online Access:	https://arxiv.org/abs/2511.03046

Similar Items

DatUS^2: Data-driven Unsupervised Semantic Segmentation with Pre-trained Self-supervised Vision Transformer
by: Kumar, Sonal, et al.
Published: (2024)

Domain-Specific Self-Supervised Pre-training for Agricultural Disease Classification: A Hierarchical Vision Transformer Study
by: Sonavane, Arnav S.
Published: (2026)

SynGen-Vision: Synthetic Data Generation for training industrial vision models
by: Dubey, Alpana, et al.
Published: (2025)

DIET-CP: Lightweight and Data Efficient Self Supervised Continued Pretraining
by: Rodas, Bryan, et al.
Published: (2025)

Do All Vision Transformers Need Registers? A Cross-Architectural Reassessment
by: Baxevanakis, Spiros, et al.
Published: (2026)

Massively Multi-Person 3D Human Motion Forecasting with Scene Context
by: Mueller, Felix B, et al.
Published: (2024)

PyCAT4: A Hierarchical Vision Transformer-based Framework for 3D Human Pose Estimation
by: Yang, Zongyou, et al.
Published: (2025)

On the Domain Robustness of Contrastive Vision-Language Models
by: Koddenbrock, Mario, et al.
Published: (2025)

Unified Local and Global Attention Interaction Modeling for Vision Transformers
by: Nguyen, Tan, et al.
Published: (2024)

Streetscape Analysis with Generative AI (SAGAI): Vision-Language Assessment and Mapping of Urban Scenes
by: Perez, Joan, et al.
Published: (2025)

SynthEnsemble: A Fusion of CNN, Vision Transformer, and Hybrid Models for Multi-Label Chest X-Ray Classification
by: Ashraf, S. M. Nabil, et al.
Published: (2023)

A Review of Pseudo-Labeling for Computer Vision
by: Kage, Patrick, et al.
Published: (2024)

Exploring Visual Embedding Spaces Induced by Vision Transformers for Online Auto Parts Marketplaces
by: Armijo, Cameron, et al.
Published: (2025)

VT-FSL: Bridging Vision and Text with LLMs for Few-Shot Learning
by: Li, Wenhao, et al.
Published: (2025)

In Context Learning with Vision Transformers: Case Study
by: Zhao, Antony, et al.
Published: (2025)

Attention-Aware Transformer-Based Aggregation Network for Video Periocular Recognition
by: Carreira, Luiz G F, et al.
Published: (2026)

Application of Generative Adversarial Network (GAN) for Synthetic Training Data Creation to improve performance of ANN Classifier for extracting Built-Up pixels from Landsat Satellite Imagery
by: Mukherjee, Amritendu, et al.
Published: (2025)

Residual Vision Transformer (ResViT) Based Self-Supervised Learning Model for Brain Tumor Classification
by: Karagoz, Meryem Altin, et al.
Published: (2024)

Vision Transformer-based Model for Severity Quantification of Lung Pneumonia Using Chest X-ray Images
by: Slika, Bouthaina, et al.
Published: (2023)

High-Entropy Tokens as Multimodal Failure Points in Vision-Language Models
by: He, Mengqi, et al.
Published: (2025)

Disentangling Generation and Regression in Stochastic Interpolants for Controllable Image Restoration
by: Liu, Yi, et al.
Published: (2026)

Fusion and Grouping Strategies in Deep Learning for Local Climate Zone Classification of Multimodal Remote Sensing Data
by: Thomas, Ancymol, et al.
Published: (2026)

Comparative Analysis of Vision Transformers and Convolutional Neural Networks for Medical Image Classification
by: Kawadkar, Kunal
Published: (2025)

Data-driven Super-Resolution of Flood Inundation Maps using Synthetic Simulations
by: Aravamudan, Akshay, et al.
Published: (2025)

Kolmogorov-Arnold Attention: Is Learnable Attention Better For Vision Transformers?
by: Maity, Subhajit, et al.
Published: (2025)

DesertFormer: Transformer-Based Semantic Segmentation for Off-Road Desert Terrain Classification in Autonomous Navigation Systems
by: Chebolu, Yasaswini
Published: (2026)

Serpent: Scalable and Efficient Image Restoration via Multi-scale Structured State Space Models
by: Sepehri, Mohammad Shahab, et al.
Published: (2024)

Diffusing DeBias: Synthetic Bias Amplification for Model Debiasing
by: Ciranni, Massimiliano, et al.
Published: (2025)

LeDiFlow: Learned Distribution-guided Flow Matching to Accelerate Image Generation
by: Zwick, Pascal, et al.
Published: (2025)

Looking at Model Debiasing through the Lens of Anomaly Detection
by: Pastore, Vito Paolo, et al.
Published: (2024)

Global-Local Similarity for Efficient Fine-Grained Image Recognition with Vision Transformers
by: Rios, Edwin Arkel, et al.
Published: (2024)

FutureHuman3D: Forecasting Complex Long-Term 3D Human Behavior from Video Observations
by: Diller, Christian, et al.
Published: (2022)

A Spitting Image: Modular Superpixel Tokenization in Vision Transformers
by: Aasan, Marius, et al.
Published: (2024)

Boosting Model Resilience via Implicit Adversarial Data Augmentation
by: Zhou, Xiaoling, et al.
Published: (2024)

A Genealogy of Foundation Models in Remote Sensing
by: Lane, Kevin, et al.
Published: (2025)

Comparison Study: Glacier Calving Front Delineation in Synthetic Aperture Radar Images With Deep Learning
by: Gourmelon, Nora, et al.
Published: (2025)

Label Delay in Online Continual Learning
by: Csaba, Botos, et al.
Published: (2023)

An Immersive Multi-Elevation Multi-Seasonal Dataset for 3D Reconstruction and Visualization
by: Liu, Xijun, et al.
Published: (2024)

Simplifying Source-Free Domain Adaptation for Object Detection: Effective Self-Training Strategies and Performance Insights
by: Hao, Yan, et al.
Published: (2024)

From Misclassifications to Outliers: Joint Reliability Assessment in Classification
by: Li, Yang, et al.
Published: (2026)