
Bibliographic Details
Main Authors:	Huang, Xiaoke; Wang, Jianfeng; Tang, Yansong; Zhang, Zheng; Hu, Han; Lu, Jiwen; Wang, Lijuan; Liu, Zicheng
Format:	Preprint
Published:	2023
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2312.00869

Similar Items

Segment Anything with Motion, Geometry, and Semantic Adaptation for Complex Nonlinear Visual Object Tracking
by: Zhu, Deyi, et al.
Published: (2026)

Self-Calibrated CLIP for Training-Free Open-Vocabulary Segmentation
by: Bai, Sule, et al.
Published: (2024)

SAM2-LOVE: Segment Anything Model 2 in Language-aided Audio-Visual Scenes
by: Wang, Yuji, et al.
Published: (2025)

Perceive Anything: Recognize, Explain, Caption, and Segment Anything in Images and Videos
by: Lin, Weifeng, et al.
Published: (2025)

ManiGaussian: Dynamic Gaussian Splatting for Multi-task Robotic Manipulation
by: Lu, Guanxing, et al.
Published: (2024)

Learning to Prompt Segment Anything Models
by: Huang, Jiaxing, et al.
Published: (2024)

Towards Accurate Post-training Quantization for Diffusion Models
by: Wang, Changyuan, et al.
Published: (2023)

Q-VLM: Post-training Quantization for Large Vision-Language Models
by: Wang, Changyuan, et al.
Published: (2024)

GaussianToken: An Effective Image Tokenizer with 2D Gaussian Splatting
by: Dong, Jiajun, et al.
Published: (2025)

Bring Metric Functions into Diffusion Models
by: An, Jie, et al.
Published: (2024)

VoCo-LLaMA: Towards Vision Compression with Large Language Models
by: Ye, Xubing, et al.
Published: (2024)

PolSAM: Polarimetric Scattering Mechanism Informed Segment Anything Model
by: Wang, Yuqing, et al.
Published: (2024)

Pseudo Depth Meets Gaussian: A Feed-forward RGB SLAM Baseline
by: Zhao, Linqing, et al.
Published: (2025)

Caption Anything in Video: Fine-grained Object-centric Captioning via Spatiotemporal Multimodal Prompting
by: Tang, Yunlong, et al.
Published: (2025)

LOGO: A Long-Form Video Dataset for Group Action Quality Assessment
by: Zhang, Shiyi, et al.
Published: (2024)

Narrative Action Evaluation with Prompt-Guided Multimodal Interaction
by: Zhang, Shiyi, et al.
Published: (2024)

GeoLRM: Geometry-Aware Large Reconstruction Model for High-Quality 3D Gaussian Generation
by: Zhang, Chubin, et al.
Published: (2024)

MergeSAM: Unsupervised change detection of remote sensing images based on the Segment Anything Model
by: Hu, Meiqi, et al.
Published: (2025)

Register Anything: Estimating "Corresponding Prompts" for Segment Anything Model
by: Huang, Shiqi, et al.
Published: (2025)

Idea2Img: Iterative Self-Refinement with GPT-4V(ision) for Automatic Image Design and Generation
by: Yang, Zhengyuan, et al.
Published: (2023)

GoodSAM++: Bridging Domain and Capacity Gaps via Segment Anything Model for Panoramic Semantic Segmentation
by: Zhang, Weiming, et al.
Published: (2024)

TinySAM: Pushing the Envelope for Efficient Segment Anything Model
by: Shu, Han, et al.
Published: (2023)

FlowIE: Efficient Image Enhancement via Rectified Flow
by: Zhu, Yixuan, et al.
Published: (2024)

DPMesh: Exploiting Diffusion Prior for Occluded Human Mesh Recovery
by: Zhu, Yixuan, et al.
Published: (2024)

Universal Segmentation at Arbitrary Granularity with Language Instruction
by: Liu, Yong, et al.
Published: (2023)

SAM Meets UAP: Attacking Segment Anything Model With Universal Adversarial Perturbation
by: Han, Dongshen, et al.
Published: (2023)

Matte Anything: Interactive Natural Image Matting with Segment Anything Models
by: Yao, Jingfeng, et al.
Published: (2023)

GoodSAM: Bridging Domain and Capacity Gaps via Segment Anything Model for Distortion-aware Panoramic Semantic Segmentation
by: Zhang, Weiming, et al.
Published: (2024)

I-MedSAM: Implicit Medical Image Segmentation with Segment Anything
by: Wei, Xiaobao, et al.
Published: (2023)

Fully Aligned Network for Referring Image Segmentation
by: Liu, Yong, et al.
Published: (2024)

Continual Learning for Segment Anything Model Adaptation
by: Yang, Jinglong, et al.
Published: (2024)

LiVOS: Light Video Object Segmentation with Gated Linear Matching
by: Liu, Qin, et al.
Published: (2024)

SANeRF-HQ: Segment Anything for NeRF in High Quality
by: Liu, Yichen, et al.
Published: (2023)

Causal Prompt Calibration Guided Segment Anything Model for Open-Vocabulary Multi-Entity Segmentation
by: Wang, Jingyao, et al.
Published: (2025)

OccNeRF: Advancing 3D Occupancy Prediction in LiDAR-Free Environments
by: Zhang, Chubin, et al.
Published: (2023)

RemoteSAM: Towards Segment Anything for Earth Observation
by: Yao, Liang, et al.
Published: (2025)

Completing Visual Objects via Bridging Generation and Segmentation
by: Li, Xiang, et al.
Published: (2023)

Open-Vocabulary Segmentation with Semantic-Assisted Calibration
by: Liu, Yong, et al.
Published: (2023)

CLAP: Contrastive Latent Action Pretraining for Learning Vision-Language-Action Models from Human Videos
by: Zhang, Chubin, et al.
Published: (2026)

CaptionQA: Is Your Caption as Useful as the Image Itself?
by: Yang, Shijia, et al.
Published: (2025)