| Main Authors: | Zhang, Qin, An, Dongsheng, Xiao, Tianjun, He, Tong, Tang, Qingming, Wu, Ying Nian, Tighe, Joseph, Xing, Yifan, Soatto, Stefano |
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2305.12039 |
Similar Items
Threshold-Consistent Margin Loss for Open-World Deep Metric Learning
by: Zhang, Qin, et al.
Published: (2023)
VideoSAM: Open-World Video Segmentation
by: Guo, Pinxue, et al.
Published: (2024)
DocKD: Knowledge Distillation from LLMs for Open-World Document Understanding Models
by: Kim, Sungnyun, et al.
Published: (2024)
EOL: Transductive Few-Shot Open-Set Recognition by Enhancing Outlier Logits
by: Ochal, Mateusz, et al.
Published: (2024)
Open-World Dynamic Prompt and Continual Visual Representation Learning
by: Kim, Youngeun, et al.
Published: (2024)
OpenVIS: Open-vocabulary Video Instance Segmentation
by: Guo, Pinxue, et al.
Published: (2023)
Diffeomorphic Template Registration for Atmospheric Turbulence Mitigation
by: Lao, Dong, et al.
Published: (2024)
Open World MRI Reconstruction with Bias-Calibrated Adaptation
by: Liu, Jiyao, et al.
Published: (2026)
Hawk: Learning to Understand Open-World Video Anomalies
by: Tang, Jiaqi, et al.
Published: (2024)
Generate, Transduct, Adapt: Iterative Transduction with VLMs
by: Saha, Oindrila, et al.
Published: (2025)
Learning to Generalize without Bias for Open-Vocabulary Action Recognition
by: Yu, Yating, et al.
Published: (2025)
Rethinking The Training And Evaluation of Rich-Context Layout-to-Image Generation
by: Cheng, Jiaxin, et al.
Published: (2024)
Decorrelating Structure via Adapters Makes Ensemble Learning Practical for Semi-supervised Learning
by: Wu, Jiaqi, et al.
Published: (2024)
Sub-token ViT Embedding via Stochastic Resonance Transformers
by: Lao, Dong, et al.
Published: (2023)
Towards Open-World Gesture Recognition
by: Shen, Junxiao, et al.
Published: (2024)
Training Data Protection with Compositional Diffusion Models
by: Golatkar, Aditya, et al.
Published: (2023)
Descriminative-Generative Custom Tokens for Vision-Language Models
by: Perera, Pramuditha, et al.
Published: (2025)
NeRF-Insert: 3D Local Editing with Multimodal Control Signals
by: Sabat, Benet Oriol, et al.
Published: (2024)
SDVPT: Semantic-Driven Visual Prompt Tuning for Open-World Object Counting
by: Zhao, Yiming, et al.
Published: (2025)
Unlocking Transfer Learning for Open-World Few-Shot Recognition
by: Kim, Byeonggeun, et al.
Published: (2024)
AugUndo: Scaling Up Augmentations for Monocular Depth Completion and Estimation
by: Wu, Yangchao, et al.
Published: (2023)
Benchmarking Zero-Shot Recognition with Vision-Language Models: Challenges on Granularity and Specificity
by: Xu, Zhenlin, et al.
Published: (2023)
Human Activity Recognition in an Open World
by: Prijatelj, Derek S., et al.
Published: (2022)
Beyond 3D VQAs: Injecting 3D Spatial Priors into Vision-Language Models for Enhanced Geometric Reasoning
by: Yeh, Chun-Hsiao, et al.
Published: (2026)
Non-autoregressive Sequence-to-Sequence Vision-Language Models
by: Shi, Kunyu, et al.
Published: (2024)
Uncertainty-aware Long-tailed Weights Model the Utility of Pseudo-labels for Semi-supervised Learning
by: Wu, Jiaqi, et al.
Published: (2025)
CalibNet: Dual-branch Cross-modal Calibration for RGB-D Salient Instance Segmentation
by: Pei, Jialun, et al.
Published: (2023)
DualMem: Bypassing the Objectness Bottleneck for Calibrated Unknown-Stream Filtering in Open-World Object Detection
by: Xiao, Yingjun, et al.
Published: (2026)
Bridging Coarse and Fine Recognition: A Hybrid Approach for Open-Ended Multi-Granularity Object Recognition in Interactive Educational Games
by: Yi, Hanling, et al.
Published: (2026)
Dual-Imbalance Continual Learning for Real-World Food Recognition
by: Zhang, Xiaoyan, et al.
Published: (2026)
Divided Attention: Unsupervised Multi-Object Discovery with Contextually Separated Slots
by: Lao, Dong, et al.
Published: (2023)
Talk2Move: Reinforcement Learning for Text-Instructed Object-Level Geometric Transformation in Scenes
by: Tan, Jing, et al.
Published: (2026)
Transductive One-Shot Learning Meet Subspace Decomposition
by: Stein, Kyle, et al.
Published: (2025)
TLRN: Temporal Latent Residual Networks For Large Deformation Image Registration
by: Wu, Nian, et al.
Published: (2024)
Hallucination of Multimodal Large Language Models: A Survey
by: Bai, Zechen, et al.
Published: (2024)
Boosting Open Set Recognition Performance through Modulated Representation Learning
by: Kundu, Amit Kumar, et al.
Published: (2025)
Warp-as-History: Generalizable Camera-Controlled Video Generation from One Training Video
by: Wang, Yifan, et al.
Published: (2026)
Open-Vocabulary Segmentation with Semantic-Assisted Calibration
by: Liu, Yong, et al.
Published: (2023)
Test-Time Defense Against Adversarial Attacks via Stochastic Resonance of Latent Ensembles
by: Lao, Dong, et al.
Published: (2025)
Learning Geodesics of Geometric Shape Deformations From Images
by: Wu, Nian, et al.
Published: (2024)