Library Catalog

Bibliographic Details
Main Authors:	Nottebaum, Moritz; Dunnhofer, Matteo; Micheloni, Christian
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2409.03460

Similar Items

Beyond MACs: Hardware Efficient Architecture Design for Vision Backbones
by: Nottebaum, Moritz, et al.
Published: (2026)

CPUBone: Efficient Vision Backbone Design for Devices with Low Parallelization Capabilities
by: Nottebaum, Moritz, et al.
Published: (2026)

Is Tracking really more challenging in First Person Egocentric Vision?
by: Dunnhofer, Matteo, et al.
Published: (2025)

Better, But Not Sufficient: Testing Video ANNs Against Macaque IT Dynamics
by: Dunnhofer, Matteo, et al.
Published: (2026)

Online Episodic Memory Visual Query Localization with Egocentric Streaming Object Memory
by: Manigrasso, Zaira, et al.
Published: (2024)

Tracking Skiers from the Top to the Bottom
by: Dunnhofer, Matteo, et al.
Published: (2023)

SkelMamba: A State Space Model for Efficient Skeleton Action Recognition of Neurological Disorders
by: Martinel, Niki, et al.
Published: (2024)

H3D-MarNet: Wavelet-Guided Dual-Path Learning for Metal Artifact Suppression and CT Modality Transformation for Radiotherapy Workflows
by: Rehman, Mubashara, et al.
Published: (2026)

ReMAR-DS: Recalibrated Feature Learning for Metal Artifact Reduction and CT Domain Transformation
by: Rehman, Mubashara, et al.
Published: (2025)

ModalFormer: Multimodal Transformer for Low-Light Image Enhancement
by: Brateanu, Alexandru, et al.
Published: (2025)

WidthFormer: Toward Efficient Transformer-based BEV View Transformation
by: Yang, Chenhongyi, et al.
Published: (2024)

Revisiting the Integration of Convolution and Attention for Vision Backbone
by: Zhu, Lei, et al.
Published: (2024)

ScatterFormer: Efficient Voxel Transformer with Scattered Linear Attention
by: He, Chenhang, et al.
Published: (2024)

MixFormerV2: Efficient Fully Transformer Tracking
by: Cui, Yutao, et al.
Published: (2023)

FakeFormer: Efficient Vulnerability-Driven Transformers for Generalisable Deepfake Detection
by: Nguyen, Dat, et al.
Published: (2024)

UniFormer: Unifying Convolution and Self-attention for Visual Recognition
by: Li, Kunchang, et al.
Published: (2022)

JAQ: Joint Efficient Architecture Design and Low-Bit Quantization with Hardware-Software Co-Exploration
by: Wang, Mingzi, et al.
Published: (2025)

Convolutional Initialization for Data-Efficient Vision Transformers
by: Zheng, Jianqiao, et al.
Published: (2024)

RapidNet: Multi-Level Dilated Convolution Based Mobile Backbone
by: Munir, Mustafa, et al.
Published: (2024)

MambaVision: A Hybrid Mamba-Transformer Vision Backbone
by: Hatamizadeh, Ali, et al.
Published: (2024)

Activation-Free Backbones for Image Recognition: Polynomial Alternatives within MetaFormer-Style Vision Models
by: Wang, Jeffrey, et al.
Published: (2026)

Efficient Quantum Convolutional Neural Networks for Image Classification: Overcoming Hardware Constraints
by: Röseler, Peter, et al.
Published: (2025)

SegFormer3D: an Efficient Transformer for 3D Medical Image Segmentation
by: Perera, Shehan, et al.
Published: (2024)

AnchorFormer: Differentiable Anchor Attention for Efficient Vision Transformer
by: Shan, Jiquan, et al.
Published: (2025)

Vision Backbone Efficient Selection for Image Classification in Low-Data Regimes
by: Guerin, Joris, et al.
Published: (2024)

LoFormer: Local Frequency Transformer for Image Deblurring
by: Mao, Xintian, et al.
Published: (2024)

CountFormer: Multi-View Crowd Counting Transformer
by: Mo, Hong, et al.
Published: (2024)

GridFormer: Point-Grid Transformer for Surface Reconstruction
by: Li, Shengtao, et al.
Published: (2024)

GeoFormer: A Multi-Polygon Segmentation Transformer
by: Khomiakov, Maxim, et al.
Published: (2024)

MonoFormer: One Transformer for Both Diffusion and Autoregression
by: Zhao, Chuyang, et al.
Published: (2024)

AesFormer: Transform Everyday Photos into Beautiful Memories
by: Du, Tianxiang, et al.
Published: (2026)

Low-Bit Integerization of Vision Transformers using Operand Reordering for Efficient Hardware
by: Lin, Ching-Yi, et al.
Published: (2025)

KnapFormer: An Online Load Balancer for Efficient Diffusion Transformers Training
by: Zhang, Kai, et al.
Published: (2025)

SLAM-Former: Putting SLAM into One Transformer
by: Yuan, Yijun, et al.
Published: (2025)

Low-Cost Self-Ensembles Based on Multi-Branch Transformation and Grouped Convolution
by: Lee, Hojung, et al.
Published: (2024)

Rethinking Backbone Design for Lightweight 3D Object Detection in LiDAR
by: Chandorkar, Adwait, et al.
Published: (2025)

A Comparative Study of Image Restoration Networks for General Backbone Network Design
by: Chen, Xiangyu, et al.
Published: (2023)

Depth-Wise Convolutions in Vision Transformers for Efficient Training on Small Datasets
by: Zhang, Tianxiao, et al.
Published: (2024)

CompetitorFormer: Competitor Transformer for 3D Instance Segmentation
by: Wang, Duanchu, et al.
Published: (2024)

SkateFormer: Skeletal-Temporal Transformer for Human Action Recognition
by: Do, Jeonghyeok, et al.
Published: (2024)