:: Library Catalog

Bibliographic Details
Main Authors:	Espinosa, Miguel; Yang, Chenhongyi; Ericsson, Linus; McDonagh, Steven; Crowley, Elliot J.
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2411.15288

Similar Items

No time to train! Training-Free Reference-Based Instance Segmentation
by: Espinosa, Miguel, et al.
Published: (2025)

einspace: Searching for Neural Architectures from Fundamental Operations
by: Ericsson, Linus, et al.
Published: (2024)

PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition
by: Yang, Chenhongyi, et al.
Published: (2024)

Label-Efficient Object Detection via Region Proposal Network Pre-Training
by: Dong, Nanqing, et al.
Published: (2022)

Plug and Play Active Learning for Object Detection
by: Yang, Chenhongyi, et al.
Published: (2022)

WidthFormer: Toward Efficient Transformer-based BEV View Transformation
by: Yang, Chenhongyi, et al.
Published: (2024)

Mind the Gap: Benchmarking Spatial Reasoning in Vision-Language Models
by: Stogiannidis, Ilias, et al.
Published: (2025)

EgoPoseFormer: A Simple Baseline for Stereo Egocentric 3D Human Pose Estimation
by: Yang, Chenhongyi, et al.
Published: (2024)

Improving Object Detection via Local-global Contrastive Learning
by: Triantafyllidou, Danai, et al.
Published: (2024)

Concept-based Adversarial Attack: a Probabilistic Perspective
by: Zhang, Andi, et al.
Published: (2025)

GaussianHeadTalk: Wobble-Free 3D Talking Heads with Audio Driven Gaussian Splatting
by: Agarwal, Madhav, et al.
Published: (2025)

Why Do Vision Language Models Struggle To Recognize Human Emotions?
by: Agarwal, Madhav, et al.
Published: (2026)

Erase to Enhance: Data-Efficient Machine Unlearning in MRI Reconstruction
by: Xue, Yuyang, et al.
Published: (2024)

CheXGenBench: A Unified Benchmark For Fidelity, Privacy and Utility of Synthetic Chest Radiographs
by: Dutt, Raman, et al.
Published: (2025)

View-Consistent Diffusion Representations for 3D-Consistent Video Generation
by: Danier, Duolikun, et al.
Published: (2025)

Rethinking Inter-LoRA Orthogonality in Adapter Merging: Insights from Orthogonal Monte Carlo Dropout
by: Zhang, Andi, et al.
Published: (2025)

COP-GEN-Beta: Unified Generative Modelling of COPernicus Imagery Thumbnails
by: Espinosa, Miguel, et al.
Published: (2025)

COP-GEN: Latent Diffusion Transformer for Copernicus Earth Observation Data
by: Espinosa, Miguel, et al.
Published: (2026)

MULAN: A Multi Layer Annotated Dataset for Controllable Text-to-Image Generation
by: Tudosiu, Petru-Daniel, et al.
Published: (2024)

SWiFT: Soft-Mask Weight Fine-tuning for Bias Mitigation
by: Yan, Junyu, et al.
Published: (2025)

CRCE: Coreference-Retention Concept Erasure in Text-to-Image Diffusion Models
by: Xue, Yuyang, et al.
Published: (2025)

Exploiting Mixture-of-Experts Redundancy Unlocks Multimodal Generative Abilities
by: Dutt, Raman, et al.
Published: (2025)

Evolutionary Architecture Search through Grammar-Based Sequence Alignment
by: Martín, Adri Gómez, et al.
Published: (2025)

Beyond Pixel Histories: World Models with Persistent 3D State
by: Garcin, Samuel, et al.
Published: (2026)

Unlocking the Potential of Weakly Labeled Data: A Co-Evolutionary Learning Framework for Abnormality Detection and Report Generation
by: Sun, Jinghan, et al.
Published: (2024)

Parameter-Efficient Fine-Tuning for Medical Image Analysis: The Missed Opportunity
by: Dutt, Raman, et al.
Published: (2023)

Grounded SAM: Assembling Open-World Models for Diverse Visual Tasks
by: Ren, Tianhe, et al.
Published: (2024)

A Backbone for Long-Horizon Robot Task Understanding
by: Chen, Xiaoshuai, et al.
Published: (2024)

VRP-SAM: SAM with Visual Reference Prompt
by: Sun, Yanpeng, et al.
Published: (2024)

A Shift in Perspective on Causality in Domain Generalization
by: Machlanski, Damian, et al.
Published: (2025)

From SAM to SAM 2: Exploring Improvements in Meta's Segment Anything Model
by: Geetha, Athulya Sundaresan, et al.
Published: (2024)

TS-SAM: Fine-Tuning Segment-Anything Model for Downstream Tasks
by: Yu, Yang, et al.
Published: (2024)

SAM-Guided Masked Token Prediction for 3D Scene Understanding
by: Chen, Zhimin, et al.
Published: (2024)

SAILViT: Towards Robust and Generalizable Visual Backbones for MLLMs via Gradual Feature Refinement
by: Yin, Weijie, et al.
Published: (2025)

ProSAM: Enhancing the Robustness of SAM-based Visual Reference Segmentation with Probabilistic Prompts
by: Wang, Xiaoqi, et al.
Published: (2025)

VisionLLaMA: A Unified LLaMA Backbone for Vision Tasks
by: Chu, Xiangxiang, et al.
Published: (2024)

VersaViT: Enhancing MLLM Vision Backbones via Task-Guided Optimization
by: Liu, Yikun, et al.
Published: (2026)

SAM$^{*}$: Task-Adaptive SAM with Physics-Guided Rewards
by: Barakati, Kamyar, et al.
Published: (2025)

VirDA: Reusing Backbone for Unsupervised Domain Adaptation with Visual Reprogramming
by: Nguyen, Duy, et al.
Published: (2025)

Unveiling the Backbone-Optimizer Coupling Bias in Visual Representation Learning
by: Li, Siyuan, et al.
Published: (2024)