| Main Authors: | Nottebaum, Moritz, Dunnhofer, Matteo, Micheloni, Christian |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2409.03460 |
Similar Items
Beyond MACs: Hardware Efficient Architecture Design for Vision Backbones
by: Nottebaum, Moritz, et al.
Published: (2026)
CPUBone: Efficient Vision Backbone Design for Devices with Low Parallelization Capabilities
by: Nottebaum, Moritz, et al.
Published: (2026)
Is Tracking really more challenging in First Person Egocentric Vision?
by: Dunnhofer, Matteo, et al.
Published: (2025)
Better, But Not Sufficient: Testing Video ANNs Against Macaque IT Dynamics
by: Dunnhofer, Matteo, et al.
Published: (2026)
Online Episodic Memory Visual Query Localization with Egocentric Streaming Object Memory
by: Manigrasso, Zaira, et al.
Published: (2024)
Tracking Skiers from the Top to the Bottom
by: Dunnhofer, Matteo, et al.
Published: (2023)
SkelMamba: A State Space Model for Efficient Skeleton Action Recognition of Neurological Disorders
by: Martinel, Niki, et al.
Published: (2024)
H3D-MarNet: Wavelet-Guided Dual-Path Learning for Metal Artifact Suppression and CT Modality Transformation for Radiotherapy Workflows
by: Rehman, Mubashara, et al.
Published: (2026)
ReMAR-DS: Recalibrated Feature Learning for Metal Artifact Reduction and CT Domain Transformation
by: Rehman, Mubashara, et al.
Published: (2025)
ModalFormer: Multimodal Transformer for Low-Light Image Enhancement
by: Brateanu, Alexandru, et al.
Published: (2025)
WidthFormer: Toward Efficient Transformer-based BEV View Transformation
by: Yang, Chenhongyi, et al.
Published: (2024)
Revisiting the Integration of Convolution and Attention for Vision Backbone
by: Zhu, Lei, et al.
Published: (2024)
ScatterFormer: Efficient Voxel Transformer with Scattered Linear Attention
by: He, Chenhang, et al.
Published: (2024)
MixFormerV2: Efficient Fully Transformer Tracking
by: Cui, Yutao, et al.
Published: (2023)
FakeFormer: Efficient Vulnerability-Driven Transformers for Generalisable Deepfake Detection
by: Nguyen, Dat, et al.
Published: (2024)
UniFormer: Unifying Convolution and Self-attention for Visual Recognition
by: Li, Kunchang, et al.
Published: (2022)
JAQ: Joint Efficient Architecture Design and Low-Bit Quantization with Hardware-Software Co-Exploration
by: Wang, Mingzi, et al.
Published: (2025)
Convolutional Initialization for Data-Efficient Vision Transformers
by: Zheng, Jianqiao, et al.
Published: (2024)
RapidNet: Multi-Level Dilated Convolution Based Mobile Backbone
by: Munir, Mustafa, et al.
Published: (2024)
MambaVision: A Hybrid Mamba-Transformer Vision Backbone
by: Hatamizadeh, Ali, et al.
Published: (2024)
Activation-Free Backbones for Image Recognition: Polynomial Alternatives within MetaFormer-Style Vision Models
by: Wang, Jeffrey, et al.
Published: (2026)
Efficient Quantum Convolutional Neural Networks for Image Classification: Overcoming Hardware Constraints
by: Röseler, Peter, et al.
Published: (2025)
SegFormer3D: an Efficient Transformer for 3D Medical Image Segmentation
by: Perera, Shehan, et al.
Published: (2024)
AnchorFormer: Differentiable Anchor Attention for Efficient Vision Transformer
by: Shan, Jiquan, et al.
Published: (2025)
Vision Backbone Efficient Selection for Image Classification in Low-Data Regimes
by: Guerin, Joris, et al.
Published: (2024)
LoFormer: Local Frequency Transformer for Image Deblurring
by: Mao, Xintian, et al.
Published: (2024)
CountFormer: Multi-View Crowd Counting Transformer
by: Mo, Hong, et al.
Published: (2024)
GridFormer: Point-Grid Transformer for Surface Reconstruction
by: Li, Shengtao, et al.
Published: (2024)
GeoFormer: A Multi-Polygon Segmentation Transformer
by: Khomiakov, Maxim, et al.
Published: (2024)
MonoFormer: One Transformer for Both Diffusion and Autoregression
by: Zhao, Chuyang, et al.
Published: (2024)
AesFormer: Transform Everyday Photos into Beautiful Memories
by: Du, Tianxiang, et al.
Published: (2026)
Low-Bit Integerization of Vision Transformers using Operand Reordering for Efficient Hardware
by: Lin, Ching-Yi, et al.
Published: (2025)
KnapFormer: An Online Load Balancer for Efficient Diffusion Transformers Training
by: Zhang, Kai, et al.
Published: (2025)
SLAM-Former: Putting SLAM into One Transformer
by: Yuan, Yijun, et al.
Published: (2025)
Low-Cost Self-Ensembles Based on Multi-Branch Transformation and Grouped Convolution
by: Lee, Hojung, et al.
Published: (2024)
Rethinking Backbone Design for Lightweight 3D Object Detection in LiDAR
by: Chandorkar, Adwait, et al.
Published: (2025)
A Comparative Study of Image Restoration Networks for General Backbone Network Design
by: Chen, Xiangyu, et al.
Published: (2023)
Depth-Wise Convolutions in Vision Transformers for Efficient Training on Small Datasets
by: Zhang, Tianxiao, et al.
Published: (2024)
CompetitorFormer: Competitor Transformer for 3D Instance Segmentation
by: Wang, Duanchu, et al.
Published: (2024)
SkateFormer: Skeletal-Temporal Transformer for Human Action Recognition
by: Do, Jeonghyeok, et al.
Published: (2024)