:: Library Catalog

Bibliographic Details
Main Authors:	Pan, Ting; Tang, Lulu; Wang, Xinlong; Shan, Shiguang
Format:	Preprint
Published:	2023
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2312.09128

Similar Items

Autoregressive Video Generation without Vector Quantization
by: Deng, Haoge, et al.
Published: (2024)

Anonymization Prompt Learning for Facial Privacy-Preserving Text-to-Image Generation
by: Shi, Liang, et al.
Published: (2024)

LINA: Linear Autoregressive Image Generative Models with Continuous Tokens
by: Wang, Jiahao, et al.
Published: (2026)

Segment Anything for Videos: A Systematic Survey
by: Zhang, Chunhui, et al.
Published: (2024)

Uniform Discrete Diffusion with Metric Path for Video Generation
by: Deng, Haoge, et al.
Published: (2025)

Generalized Face Liveness Detection via De-fake Face Generator
by: Long, Xingming, et al.
Published: (2024)

Contrastive Learning of Person-independent Representations for Facial Action Unit Detection
by: Li, Yong, et al.
Published: (2024)

Register Anything: Estimating "Corresponding Prompts" for Segment Anything Model
by: Huang, Shiqi, et al.
Published: (2025)

Matcher: Segment Anything with One Shot Using All-Purpose Feature Matching
by: Liu, Yang, et al.
Published: (2023)

MotionVerse: A Unified Multimodal Framework for Motion Comprehension, Generation and Editing
by: Hou, Ruibing, et al.
Published: (2025)

Towards Robust Semantic Segmentation against Patch-based Attack via Attention Refinement
by: Yuan, Zheng, et al.
Published: (2024)

Confidence Aware Learning for Reliable Face Anti-spoofing
by: Long, Xingming, et al.
Published: (2024)

EntropyScan: Towards Model-level Backdoor Detection in LVLMs via Visual Attention Entropy
by: Ge, Xuanyu, et al.
Published: (2026)

OSI: One-step Inversion Excels in Extracting Diffusion Watermarks
by: Chen, Yuwei, et al.
Published: (2026)

GLip: A Global-Local Integrated Progressive Framework for Robust Visual Speech Recognition
by: Wang, Tianyue, et al.
Published: (2025)

Dynamic Attention Analysis for Backdoor Detection in Text-to-Image Diffusion Models
by: Wang, Zhongqi, et al.
Published: (2025)

T2IShield: Defending Against Backdoors on Text-to-Image Diffusion Models
by: Wang, Zhongqi, et al.
Published: (2024)

Assimilation Matters: Model-level Backdoor Detection in Vision-Language Pretrained Models
by: Wang, Zhongqi, et al.
Published: (2025)

Pre-trained Model Guided Fine-Tuning for Zero-Shot Adversarial Robustness
by: Wang, Sibo, et al.
Published: (2024)

3D Segment Anything Model with Visual Mamba for Diagnosing Placenta Accreta Spectrum
by: Zhang, Yuliang, et al.
Published: (2026)

Learning to Prompt Segment Anything Models
by: Huang, Jiaxing, et al.
Published: (2024)

You See it, You Got it: Learning 3D Creation on Pose-Free Videos at Scale
by: Ma, Baorui, et al.
Published: (2024)

ACT Now: Preempting LVLM Hallucinations via Adaptive Context Integration
by: Yan, Bei, et al.
Published: (2026)

Leveraging Segment Anything Model for Source-Free Domain Adaptation via Dual Feature Guided Auto-Prompting
by: Huai, Zheang, et al.
Published: (2025)

Neural Gate: Mitigating Privacy Risks in LVLMs via Neuron-Level Gradient Gating
by: Cao, Xiangkui, et al.
Published: (2026)

UDD: Dataset Distillation via Mining Underutilized Regions
by: Wang, Shiguang, et al.
Published: (2024)

Caption Anything in Video: Fine-grained Object-centric Captioning via Spatiotemporal Multimodal Prompting
by: Tang, Yunlong, et al.
Published: (2025)

Segment Anything in Light Fields for Real-Time Applications via Constrained Prompting
by: Goncharov, Nikolai, et al.
Published: (2024)

HPNet: Dynamic Trajectory Forecasting with Historical Prediction Attention
by: Tang, Xiaolong, et al.
Published: (2024)

CtrLoRA: An Extensible and Efficient Framework for Controllable Image Generation
by: Xu, Yifeng, et al.
Published: (2024)

Semantic or Covariate? A Study on the Intractable Case of Out-of-Distribution Detection
by: Long, Xingming, et al.
Published: (2024)

BIMM: Brain Inspired Masked Modeling for Video Representation Learning
by: Wan, Zhifan, et al.
Published: (2024)

VOPE: Revisiting Hallucination of Vision-Language Models in Voluntary Imagination Task
by: Long, Xingming, et al.
Published: (2025)

FullLoRA: Efficiently Boosting the Robustness of Pretrained Vision Transformers
by: Yuan, Zheng, et al.
Published: (2024)

Rethinking the Evaluation of Out-of-Distribution Detection: A Sorites Paradox
by: Long, Xingming, et al.
Published: (2024)

Segment and Caption Anything
by: Huang, Xiaoke, et al.
Published: (2023)

SAMIC: Segment Anything with In-Context Spatial Prompt Engineering
by: Nagendra, Savinay, et al.
Published: (2024)

Benchmarking Human and Automated Prompting in the Segment Anything Model
by: Quesada, Jorge, et al.
Published: (2024)

From Static to Dynamic: Adapting Landmark-Aware Image Models for Facial Expression Recognition in Videos
by: Chen, Yin, et al.
Published: (2023)

Contrastive Spectral Rectification: Test-Time Defense towards Zero-shot Adversarial Robustness of CLIP
by: Nie, Sen, et al.
Published: (2026)