:: Library Catalog

Bibliographic Details
Main Authors:	Zhang, Qin; An, Dongsheng; Xiao, Tianjun; He, Tong; Tang, Qingming; Wu, Ying Nian; Tighe, Joseph; Xing, Yifan; Soatto, Stefano
Format:	Preprint
Published:	2023
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2305.12039

Similar Items

Threshold-Consistent Margin Loss for Open-World Deep Metric Learning
by: Zhang, Qin, et al.
Published: (2023)

VideoSAM: Open-World Video Segmentation
by: Guo, Pinxue, et al.
Published: (2024)

DocKD: Knowledge Distillation from LLMs for Open-World Document Understanding Models
by: Kim, Sungnyun, et al.
Published: (2024)

EOL: Transductive Few-Shot Open-Set Recognition by Enhancing Outlier Logits
by: Ochal, Mateusz, et al.
Published: (2024)

Open-World Dynamic Prompt and Continual Visual Representation Learning
by: Kim, Youngeun, et al.
Published: (2024)

OpenVIS: Open-vocabulary Video Instance Segmentation
by: Guo, Pinxue, et al.
Published: (2023)

Diffeomorphic Template Registration for Atmospheric Turbulence Mitigation
by: Lao, Dong, et al.
Published: (2024)

Open World MRI Reconstruction with Bias-Calibrated Adaptation
by: Liu, Jiyao, et al.
Published: (2026)

Hawk: Learning to Understand Open-World Video Anomalies
by: Tang, Jiaqi, et al.
Published: (2024)

Generate, Transduct, Adapt: Iterative Transduction with VLMs
by: Saha, Oindrila, et al.
Published: (2025)

Learning to Generalize without Bias for Open-Vocabulary Action Recognition
by: Yu, Yating, et al.
Published: (2025)

Rethinking The Training And Evaluation of Rich-Context Layout-to-Image Generation
by: Cheng, Jiaxin, et al.
Published: (2024)

Decorrelating Structure via Adapters Makes Ensemble Learning Practical for Semi-supervised Learning
by: Wu, Jiaqi, et al.
Published: (2024)

Sub-token ViT Embedding via Stochastic Resonance Transformers
by: Lao, Dong, et al.
Published: (2023)

Towards Open-World Gesture Recognition
by: Shen, Junxiao, et al.
Published: (2024)

Training Data Protection with Compositional Diffusion Models
by: Golatkar, Aditya, et al.
Published: (2023)

Descriminative-Generative Custom Tokens for Vision-Language Models
by: Perera, Pramuditha, et al.
Published: (2025)

NeRF-Insert: 3D Local Editing with Multimodal Control Signals
by: Sabat, Benet Oriol, et al.
Published: (2024)

SDVPT: Semantic-Driven Visual Prompt Tuning for Open-World Object Counting
by: Zhao, Yiming, et al.
Published: (2025)

Unlocking Transfer Learning for Open-World Few-Shot Recognition
by: Kim, Byeonggeun, et al.
Published: (2024)

AugUndo: Scaling Up Augmentations for Monocular Depth Completion and Estimation
by: Wu, Yangchao, et al.
Published: (2023)

Benchmarking Zero-Shot Recognition with Vision-Language Models: Challenges on Granularity and Specificity
by: Xu, Zhenlin, et al.
Published: (2023)

Human Activity Recognition in an Open World
by: Prijatelj, Derek S., et al.
Published: (2022)

Beyond 3D VQAs: Injecting 3D Spatial Priors into Vision-Language Models for Enhanced Geometric Reasoning
by: Yeh, Chun-Hsiao, et al.
Published: (2026)

Non-autoregressive Sequence-to-Sequence Vision-Language Models
by: Shi, Kunyu, et al.
Published: (2024)

Uncertainty-aware Long-tailed Weights Model the Utility of Pseudo-labels for Semi-supervised Learning
by: Wu, Jiaqi, et al.
Published: (2025)

CalibNet: Dual-branch Cross-modal Calibration for RGB-D Salient Instance Segmentation
by: Pei, Jialun, et al.
Published: (2023)

DualMem: Bypassing the Objectness Bottleneck for Calibrated Unknown-Stream Filtering in Open-World Object Detection
by: Xiao, Yingjun, et al.
Published: (2026)

Bridging Coarse and Fine Recognition: A Hybrid Approach for Open-Ended Multi-Granularity Object Recognition in Interactive Educational Games
by: Yi, Hanling, et al.
Published: (2026)

Dual-Imbalance Continual Learning for Real-World Food Recognition
by: Zhang, Xiaoyan, et al.
Published: (2026)

Divided Attention: Unsupervised Multi-Object Discovery with Contextually Separated Slots
by: Lao, Dong, et al.
Published: (2023)

Talk2Move: Reinforcement Learning for Text-Instructed Object-Level Geometric Transformation in Scenes
by: Tan, Jing, et al.
Published: (2026)

Transductive One-Shot Learning Meet Subspace Decomposition
by: Stein, Kyle, et al.
Published: (2025)

TLRN: Temporal Latent Residual Networks For Large Deformation Image Registration
by: Wu, Nian, et al.
Published: (2024)

Hallucination of Multimodal Large Language Models: A Survey
by: Bai, Zechen, et al.
Published: (2024)

Boosting Open Set Recognition Performance through Modulated Representation Learning
by: Kundu, Amit Kumar, et al.
Published: (2025)

Warp-as-History: Generalizable Camera-Controlled Video Generation from One Training Video
by: Wang, Yifan, et al.
Published: (2026)

Open-Vocabulary Segmentation with Semantic-Assisted Calibration
by: Liu, Yong, et al.
Published: (2023)

Test-Time Defense Against Adversarial Attacks via Stochastic Resonance of Latent Ensembles
by: Lao, Dong, et al.
Published: (2025)

Learning Geodesics of Geometric Shape Deformations From Images
by: Wu, Nian, et al.
Published: (2024)