:: Library Catalog

Bibliographic Details
Main Authors:	Abdullah, Abdullah Nazhat; Aydin, Tarkan
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2405.15953

Similar Items

NiNformer: A Network in Network Transformer with Token Mixing Generated Gating Function
by: Abdullah, Abdullah Nazhat, et al.
Published: (2024)

LoLA-SpecViT: Local Attention SwiGLU Vision Transformer with LoRA for Hyperspectral Imaging
by: Zidi, Fadi Abdeladhim, et al.
Published: (2025)

ABFR-KAN: Kolmogorov-Arnold Networks for Functional Brain Analysis
by: Ward, Tyler, et al.
Published: (2026)

ADFQ-ViT: Activation-Distribution-Friendly Post-Training Quantization for Vision Transformers
by: Jiang, Yanfeng, et al.
Published: (2024)

Layout Anything: One Transformer for Universal Room Layout Estimation
by: Mia, Md Sohag, et al.
Published: (2025)

TAP into the Patch Tokens: Leveraging Vision Foundation Model Features for AI-Generated Image Detection
by: Abdullah, Ahmed, et al.
Published: (2026)

Prompting Medical Vision-Language Models to Mitigate Diagnosis Bias by Generating Realistic Dermoscopic Images
by: Munia, Nusrat, et al.
Published: (2025)

Can Cross-Layer Transcoders Replace Vision Transformer Activations? An Interpretable Perspective on Vision
by: Chatzoudis, Gerasimos, et al.
Published: (2026)

Trainable Highly-expressive Activation Functions
by: Chelly, Irit, et al.
Published: (2024)

MDE-VIO: Enhancing Visual-Inertial Odometry Using Learned Depth Priors
by: Alniak, Arda, et al.
Published: (2026)

Improving Brain Disorder Diagnosis with Advanced Brain Function Representation and Kolmogorov-Arnold Networks
by: Ward, Tyler, et al.
Published: (2025)

Steering Video Diffusion Transformers with Massive Activations
by: Cheng, Xianhang, et al.
Published: (2026)

MixA-Q: Revisiting Activation Sparsity for Vision Transformers from a Mixed-Precision Quantization Perspective
by: Wang, Weitian, et al.
Published: (2025)

L-SWAG: Layer-Sample Wise Activation with Gradients information for Zero-Shot NAS on Vision Transformers
by: Casarin, Sofia, et al.
Published: (2025)

Adaptive Parametric Activation: Unifying and Generalising Activation Functions Across Tasks
by: Alexandridis, Konstantinos Panagiotis, et al.
Published: (2024)

EVCC: Enhanced Vision Transformer-ConvNeXt-CoAtNet Fusion for Classification
by: Hasan, Kazi Reyazul, et al.
Published: (2025)

Vision-Centric Activation and Coordination for Multimodal Large Language Models
by: Wang, Yunnan, et al.
Published: (2025)

Pix4Point: Image Pretrained Standard Transformers for 3D Point Cloud Understanding
by: Qian, Guocheng, et al.
Published: (2022)

Unleashing Diffusion Transformers for Visual Correspondence by Modulating Massive Activations
by: Gan, Chaofan, et al.
Published: (2025)

Massive Activations are the Key to Local Detail Synthesis in Diffusion Transformers
by: Gan, Chaofan, et al.
Published: (2025)

A Probabilistic Segment Anything Model for Ambiguity-Aware Medical Image Segmentation
by: Ward, Tyler, et al.
Published: (2025)

DArFace: Deformation Aware Robustness for Low Quality Face Recognition
by: Gulshad, Sadaf, et al.
Published: (2025)

Class-N-Diff: Classification-Induced Diffusion Model Can Make Fair Skin Cancer Diagnosis
by: Munia, Nusrat, et al.
Published: (2025)

Skin Lesion Classification Using a Soft Voting Ensemble of Convolutional Neural Networks
by: Shafi, Abdullah Al, et al.
Published: (2025)

Evaluating Model Performance with Hard-Swish Activation Function Adjustments
by: Pydimarry, Sai Abhinav, et al.
Published: (2024)

STAF: Sinusoidal Trainable Activation Functions for Implicit Neural Representation
by: Morsali, Alireza, et al.
Published: (2025)

Ask Me Again Differently: GRAS for Measuring Bias in Vision Language Models on Gender, Race, Age, and Skin Tone
by: Malik, Shaivi, et al.
Published: (2025)

VisionTrap: Unanswerable Questions On Visual Data
by: Saadat, Asir, et al.
Published: (2025)

YoloTag: Vision-based Robust UAV Navigation with Fiducial Markers
by: Raxit, Sourav, et al.
Published: (2024)

DiTAS: Quantizing Diffusion Transformers via Enhanced Activation Smoothing
by: Dong, Zhenyuan, et al.
Published: (2024)

Memory-Efficient Vision Transformers: An Activation-Aware Mixed-Rank Compression Strategy
by: Azizi, Seyedarmin, et al.
Published: (2024)

ConMamba: Contrastive Vision Mamba for Plant Disease Detection
by: Mamun, Abdullah Al, et al.
Published: (2025)

GraphFusion3D: Dynamic Graph Attention Convolution with Adaptive Cross-Modal Transformer for 3D Object Detection
by: Mia, Md Sohag, et al.
Published: (2025)

VISIONLOGIC: From Neuron Activations to Causally Grounded Concept Rules for Vision Models
by: Geng, Chuqin, et al.
Published: (2025)

GenFormer -- Generated Images are All You Need to Improve Robustness of Transformers on Small Datasets
by: Oehri, Sven, et al.
Published: (2024)

Activation Quantization of Vision Encoders Needs Prefixing Registers
by: Kim, Seunghyeon, et al.
Published: (2025)

VLM-KG: Multimodal Radiology Knowledge Graph Generation
by: Abdullah, Abdullah, et al.
Published: (2025)

PSMamba: Progressive Self-supervised Vision Mamba for Plant Disease Recognition
by: Mamun, Abdullah Al, et al.
Published: (2025)

ViTs are Everywhere: A Comprehensive Study Showcasing Vision Transformers in Different Domain
by: Mia, Md Sohag, et al.
Published: (2023)

FINER++: Building a Family of Variable-periodic Functions for Activating Implicit Neural Representation
by: Zhu, Hao, et al.
Published: (2024)