| Main Authors: | Pan, Ting, Tang, Lulu, Wang, Xinlong, Shan, Shiguang |
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2312.09128 |
Similar Items
Autoregressive Video Generation without Vector Quantization
by: Deng, Haoge, et al.
Published: (2024)
Anonymization Prompt Learning for Facial Privacy-Preserving Text-to-Image Generation
by: Shi, Liang, et al.
Published: (2024)
LINA: Linear Autoregressive Image Generative Models with Continuous Tokens
by: Wang, Jiahao, et al.
Published: (2026)
Segment Anything for Videos: A Systematic Survey
by: Zhang, Chunhui, et al.
Published: (2024)
Uniform Discrete Diffusion with Metric Path for Video Generation
by: Deng, Haoge, et al.
Published: (2025)
Generalized Face Liveness Detection via De-fake Face Generator
by: Long, Xingming, et al.
Published: (2024)
Contrastive Learning of Person-independent Representations for Facial Action Unit Detection
by: Li, Yong, et al.
Published: (2024)
Register Anything: Estimating "Corresponding Prompts" for Segment Anything Model
by: Huang, Shiqi, et al.
Published: (2025)
Matcher: Segment Anything with One Shot Using All-Purpose Feature Matching
by: Liu, Yang, et al.
Published: (2023)
MotionVerse: A Unified Multimodal Framework for Motion Comprehension, Generation and Editing
by: Hou, Ruibing, et al.
Published: (2025)
Towards Robust Semantic Segmentation against Patch-based Attack via Attention Refinement
by: Yuan, Zheng, et al.
Published: (2024)
Confidence Aware Learning for Reliable Face Anti-spoofing
by: Long, Xingming, et al.
Published: (2024)
EntropyScan: Towards Model-level Backdoor Detection in LVLMs via Visual Attention Entropy
by: Ge, Xuanyu, et al.
Published: (2026)
OSI: One-step Inversion Excels in Extracting Diffusion Watermarks
by: Chen, Yuwei, et al.
Published: (2026)
GLip: A Global-Local Integrated Progressive Framework for Robust Visual Speech Recognition
by: Wang, Tianyue, et al.
Published: (2025)
Dynamic Attention Analysis for Backdoor Detection in Text-to-Image Diffusion Models
by: Wang, Zhongqi, et al.
Published: (2025)
T2IShield: Defending Against Backdoors on Text-to-Image Diffusion Models
by: Wang, Zhongqi, et al.
Published: (2024)
Assimilation Matters: Model-level Backdoor Detection in Vision-Language Pretrained Models
by: Wang, Zhongqi, et al.
Published: (2025)
Pre-trained Model Guided Fine-Tuning for Zero-Shot Adversarial Robustness
by: Wang, Sibo, et al.
Published: (2024)
3D Segment Anything Model with Visual Mamba for Diagnosing Placenta Accreta Spectrum
by: Zhang, Yuliang, et al.
Published: (2026)
Learning to Prompt Segment Anything Models
by: Huang, Jiaxing, et al.
Published: (2024)
You See it, You Got it: Learning 3D Creation on Pose-Free Videos at Scale
by: Ma, Baorui, et al.
Published: (2024)
ACT Now: Preempting LVLM Hallucinations via Adaptive Context Integration
by: Yan, Bei, et al.
Published: (2026)
Leveraging Segment Anything Model for Source-Free Domain Adaptation via Dual Feature Guided Auto-Prompting
by: Huai, Zheang, et al.
Published: (2025)
Neural Gate: Mitigating Privacy Risks in LVLMs via Neuron-Level Gradient Gating
by: Cao, Xiangkui, et al.
Published: (2026)
UDD: Dataset Distillation via Mining Underutilized Regions
by: Wang, Shiguang, et al.
Published: (2024)
Caption Anything in Video: Fine-grained Object-centric Captioning via Spatiotemporal Multimodal Prompting
by: Tang, Yunlong, et al.
Published: (2025)
Segment Anything in Light Fields for Real-Time Applications via Constrained Prompting
by: Goncharov, Nikolai, et al.
Published: (2024)
HPNet: Dynamic Trajectory Forecasting with Historical Prediction Attention
by: Tang, Xiaolong, et al.
Published: (2024)
CtrLoRA: An Extensible and Efficient Framework for Controllable Image Generation
by: Xu, Yifeng, et al.
Published: (2024)
Semantic or Covariate? A Study on the Intractable Case of Out-of-Distribution Detection
by: Long, Xingming, et al.
Published: (2024)
BIMM: Brain Inspired Masked Modeling for Video Representation Learning
by: Wan, Zhifan, et al.
Published: (2024)
VOPE: Revisiting Hallucination of Vision-Language Models in Voluntary Imagination Task
by: Long, Xingming, et al.
Published: (2025)
FullLoRA: Efficiently Boosting the Robustness of Pretrained Vision Transformers
by: Yuan, Zheng, et al.
Published: (2024)
Rethinking the Evaluation of Out-of-Distribution Detection: A Sorites Paradox
by: Long, Xingming, et al.
Published: (2024)
Segment and Caption Anything
by: Huang, Xiaoke, et al.
Published: (2023)
SAMIC: Segment Anything with In-Context Spatial Prompt Engineering
by: Nagendra, Savinay, et al.
Published: (2024)
Benchmarking Human and Automated Prompting in the Segment Anything Model
by: Quesada, Jorge, et al.
Published: (2024)
From Static to Dynamic: Adapting Landmark-Aware Image Models for Facial Expression Recognition in Videos
by: Chen, Yin, et al.
Published: (2023)
Contrastive Spectral Rectification: Test-Time Defense towards Zero-shot Adversarial Robustness of CLIP
by: Nie, Sen, et al.
Published: (2026)