| Main Authors: | Espinosa, Miguel; Yang, Chenhongyi; Ericsson, Linus; McDonagh, Steven; Crowley, Elliot J. |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2411.15288 |
Similar Items
No time to train! Training-Free Reference-Based Instance Segmentation
by: Espinosa, Miguel, et al.
Published: (2025)
einspace: Searching for Neural Architectures from Fundamental Operations
by: Ericsson, Linus, et al.
Published: (2024)
PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition
by: Yang, Chenhongyi, et al.
Published: (2024)
Label-Efficient Object Detection via Region Proposal Network Pre-Training
by: Dong, Nanqing, et al.
Published: (2022)
Plug and Play Active Learning for Object Detection
by: Yang, Chenhongyi, et al.
Published: (2022)
WidthFormer: Toward Efficient Transformer-based BEV View Transformation
by: Yang, Chenhongyi, et al.
Published: (2024)
Mind the Gap: Benchmarking Spatial Reasoning in Vision-Language Models
by: Stogiannidis, Ilias, et al.
Published: (2025)
EgoPoseFormer: A Simple Baseline for Stereo Egocentric 3D Human Pose Estimation
by: Yang, Chenhongyi, et al.
Published: (2024)
Improving Object Detection via Local-global Contrastive Learning
by: Triantafyllidou, Danai, et al.
Published: (2024)
Concept-based Adversarial Attack: a Probabilistic Perspective
by: Zhang, Andi, et al.
Published: (2025)
GaussianHeadTalk: Wobble-Free 3D Talking Heads with Audio Driven Gaussian Splatting
by: Agarwal, Madhav, et al.
Published: (2025)
Why Do Vision Language Models Struggle To Recognize Human Emotions?
by: Agarwal, Madhav, et al.
Published: (2026)
Erase to Enhance: Data-Efficient Machine Unlearning in MRI Reconstruction
by: Xue, Yuyang, et al.
Published: (2024)
CheXGenBench: A Unified Benchmark For Fidelity, Privacy and Utility of Synthetic Chest Radiographs
by: Dutt, Raman, et al.
Published: (2025)
View-Consistent Diffusion Representations for 3D-Consistent Video Generation
by: Danier, Duolikun, et al.
Published: (2025)
Rethinking Inter-LoRA Orthogonality in Adapter Merging: Insights from Orthogonal Monte Carlo Dropout
by: Zhang, Andi, et al.
Published: (2025)
COP-GEN-Beta: Unified Generative Modelling of COPernicus Imagery Thumbnails
by: Espinosa, Miguel, et al.
Published: (2025)
COP-GEN: Latent Diffusion Transformer for Copernicus Earth Observation Data
by: Espinosa, Miguel, et al.
Published: (2026)
MULAN: A Multi Layer Annotated Dataset for Controllable Text-to-Image Generation
by: Tudosiu, Petru-Daniel, et al.
Published: (2024)
SWiFT: Soft-Mask Weight Fine-tuning for Bias Mitigation
by: Yan, Junyu, et al.
Published: (2025)
CRCE: Coreference-Retention Concept Erasure in Text-to-Image Diffusion Models
by: Xue, Yuyang, et al.
Published: (2025)
Exploiting Mixture-of-Experts Redundancy Unlocks Multimodal Generative Abilities
by: Dutt, Raman, et al.
Published: (2025)
Evolutionary Architecture Search through Grammar-Based Sequence Alignment
by: Martín, Adri Gómez, et al.
Published: (2025)
Beyond Pixel Histories: World Models with Persistent 3D State
by: Garcin, Samuel, et al.
Published: (2026)
Unlocking the Potential of Weakly Labeled Data: A Co-Evolutionary Learning Framework for Abnormality Detection and Report Generation
by: Sun, Jinghan, et al.
Published: (2024)
Parameter-Efficient Fine-Tuning for Medical Image Analysis: The Missed Opportunity
by: Dutt, Raman, et al.
Published: (2023)
Grounded SAM: Assembling Open-World Models for Diverse Visual Tasks
by: Ren, Tianhe, et al.
Published: (2024)
A Backbone for Long-Horizon Robot Task Understanding
by: Chen, Xiaoshuai, et al.
Published: (2024)
VRP-SAM: SAM with Visual Reference Prompt
by: Sun, Yanpeng, et al.
Published: (2024)
A Shift in Perspective on Causality in Domain Generalization
by: Machlanski, Damian, et al.
Published: (2025)
From SAM to SAM 2: Exploring Improvements in Meta's Segment Anything Model
by: Geetha, Athulya Sundaresan, et al.
Published: (2024)
TS-SAM: Fine-Tuning Segment-Anything Model for Downstream Tasks
by: Yu, Yang, et al.
Published: (2024)
SAM-Guided Masked Token Prediction for 3D Scene Understanding
by: Chen, Zhimin, et al.
Published: (2024)
SAILViT: Towards Robust and Generalizable Visual Backbones for MLLMs via Gradual Feature Refinement
by: Yin, Weijie, et al.
Published: (2025)
ProSAM: Enhancing the Robustness of SAM-based Visual Reference Segmentation with Probabilistic Prompts
by: Wang, Xiaoqi, et al.
Published: (2025)
VisionLLaMA: A Unified LLaMA Backbone for Vision Tasks
by: Chu, Xiangxiang, et al.
Published: (2024)
VersaViT: Enhancing MLLM Vision Backbones via Task-Guided Optimization
by: Liu, Yikun, et al.
Published: (2026)
SAM$^{*}$: Task-Adaptive SAM with Physics-Guided Rewards
by: Barakati, Kamyar, et al.
Published: (2025)
VirDA: Reusing Backbone for Unsupervised Domain Adaptation with Visual Reprogramming
by: Nguyen, Duy, et al.
Published: (2025)
Unveiling the Backbone-Optimizer Coupling Bias in Visual Representation Learning
by: Li, Siyuan, et al.
Published: (2024)