Library Catalog

Bibliographic Details
Main Authors:	Li, Andy; Durrant, Aiden; Markovic, Milan; Huang, Tianjin; Kundu, Souvik; Chen, Tianlong; Yin, Lu; Leontidis, Georgios
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2411.13545

Similar Items

HiAP: A Multi-Granular Stochastic Auto-Pruning Framework for Vision Transformers
by: Li, Andy, et al.
Published: (2026)

S-JEA: Stacked Joint Embedding Architectures for Self-Supervised Visual Representation Learning
by: Manová, Alžběta, et al.
Published: (2023)

Capsule Network Projectors are Equivariant and Invariant Learners
by: Everett, Miles, et al.
Published: (2024)

EquiCaps: Predictor-Free Pose-Aware Pre-Trained Capsule Networks
by: Konstantinou, Athinoulla, et al.
Published: (2025)

Trick-GS: A Balanced Bag of Tricks for Efficient Gaussian Splatting
by: Armagan, Anil, et al.
Published: (2025)

Bag of Tricks to Boost Adversarial Transferability
by: Zhang, Zeliang, et al.
Published: (2024)

Rethinking Pruning for Vision-Language Models: Strategies for Effective Sparsity and Performance Restoration
by: He, Shwai, et al.
Published: (2024)

(PASS) Visual Prompt Locates Good Structure Sparsity through a Recurrent HyperNetwork
by: Huang, Tianjin, et al.
Published: (2024)

LeOCLR: Leveraging Original Images for Contrastive Learning of Visual Representations
by: Alkhalefi, Mohammad, et al.
Published: (2024)

ProtoCaps: A Fast and Non-Iterative Capsule Network Routing Method
by: Everett, Miles, et al.
Published: (2023)

Masked Capsule Autoencoders
by: Everett, Miles, et al.
Published: (2024)

Optimizing Multispectral Object Detection: A Bag of Tricks and Comprehensive Benchmarks
by: Zhou, Chen, et al.
Published: (2024)

PP-DocBee: Improving Multimodal Document Understanding Through a Bag of Tricks
by: Ni, Feng, et al.
Published: (2025)

Semantic Positive Pairs for Enhancing Visual Representation Learning of Instance Discrimination Methods
by: Alkhalefi, Mohammad, et al.
Published: (2023)

Agent-Based Post-Hoc Correction of Agricultural Yield Forecasts
by: Beddows, Matthew, et al.
Published: (2026)

A Bag of Tricks for Few-Shot Class-Incremental Learning
by: Roy, Shuvendu, et al.
Published: (2024)

A Bag of Tricks for Efficient Implicit Neural Point Clouds
by: Hahlbohm, Florian, et al.
Published: (2025)

RedVTP: Training-Free Acceleration of Diffusion Vision-Language Models Inference via Masked Token-Guided Visual Token Pruning
by: Xu, Jingqi, et al.
Published: (2025)

MedFM-Robust: Benchmarking Robustness of Medical Foundation Models
by: Cui, Xiangxiang, et al.
Published: (2026)

Monkey Transfer Learning Can Improve Human Pose Estimation
by: Scott, Bradley, et al.
Published: (2024)

Visual Prompting Upgrades Neural Network Sparsification: A Data-Model Perspective
by: Jin, Can, et al.
Published: (2023)

Multimodal Federated Learning With Missing Modalities through Feature Imputation Network
by: Poudel, Pranav, et al.
Published: (2025)

Test-time Sparsity for Extreme Fast Action Diffusion
by: Ji, Kangye, et al.
Published: (2026)

IndustryEQA: Pushing the Frontiers of Embodied Question Answering in Industrial Scenarios
by: Li, Yifan, et al.
Published: (2025)

Getting the Numbers Right: Modelling Multi-Class Object Counting in Dense and Varied Scenes
by: O'Reilly, Villanelle, et al.
Published: (2025)

QVGen: Pushing the Limit of Quantized Video Generative Models
by: Huang, Yushi, et al.
Published: (2025)

Sphinx: Efficiently Serving Novel View Synthesis using Regression-Guided Selective Refinement
by: Xia, Yuchen, et al.
Published: (2025)

Dual-Stage Invariant Continual Learning under Extreme Visual Sparsity
by: Zhang, Rangya, et al.
Published: (2026)

BlinkFlow: A Dataset to Push the Limits of Event-based Optical Flow Estimation
by: Li, Yijin, et al.
Published: (2023)

BiDM: Pushing the Limit of Quantization for Diffusion Models
by: Zheng, Xingyu, et al.
Published: (2024)

Bag of Bags: Adaptive Visual Vocabularies for Genizah Join Image Retrieval
by: Gogawale, Sharva, et al.
Published: (2026)

CLIP Tricks You: Training-free Token Pruning for Efficient Pixel Grounding in Large Vision-Language Models
by: Lee, Sangin, et al.
Published: (2026)

GenQ: Quantization in Low Data Regimes with Generative Synthetic Data
by: Li, Yuhang, et al.
Published: (2023)

LANTERN++: Enhancing Relaxed Speculative Decoding with Static Tree Drafting for Visual Auto-regressive Models
by: Park, Sihwan, et al.
Published: (2025)

SSR: Pushing the Limit of Spatial Intelligence with Structured Scene Reasoning
by: Zhang, Yi, et al.
Published: (2026)

KAN-HyperpointNet for Point Cloud Sequence-Based 3D Human Action Recognition
by: Chen, Zhaoyu, et al.
Published: (2024)

VLM-Pruner: Buffering for Spatial Sparsity in an Efficient VLM Centrifugal Token Pruning Paradigm
by: Wu, Zhenkai, et al.
Published: (2025)

CLAMP-ViT: Contrastive Data-Free Learning for Adaptive Post-Training Quantization of ViTs
by: Ramachandran, Akshat, et al.
Published: (2024)

SCOPE: Scene-Contextualized Incremental Few-Shot 3D Segmentation
by: Thengane, Vishal, et al.
Published: (2026)

Are Sparse Neural Networks Better Hard Sample Learners?
by: Xiao, Qiao, et al.
Published: (2024)