| Main Authors: | Huang, Xiaoke, Wang, Jianfeng, Tang, Yansong, Zhang, Zheng, Hu, Han, Lu, Jiwen, Wang, Lijuan, Liu, Zicheng |
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Online Access: | https://arxiv.org/abs/2312.00869 |
Similar Items
Segment Anything with Motion, Geometry, and Semantic Adaptation for Complex Nonlinear Visual Object Tracking
by: Zhu, Deyi, et al.
Published: (2026)
Self-Calibrated CLIP for Training-Free Open-Vocabulary Segmentation
by: Bai, Sule, et al.
Published: (2024)
SAM2-LOVE: Segment Anything Model 2 in Language-aided Audio-Visual Scenes
by: Wang, Yuji, et al.
Published: (2025)
Perceive Anything: Recognize, Explain, Caption, and Segment Anything in Images and Videos
by: Lin, Weifeng, et al.
Published: (2025)
ManiGaussian: Dynamic Gaussian Splatting for Multi-task Robotic Manipulation
by: Lu, Guanxing, et al.
Published: (2024)
Learning to Prompt Segment Anything Models
by: Huang, Jiaxing, et al.
Published: (2024)
Towards Accurate Post-training Quantization for Diffusion Models
by: Wang, Changyuan, et al.
Published: (2023)
Q-VLM: Post-training Quantization for Large Vision-Language Models
by: Wang, Changyuan, et al.
Published: (2024)
GaussianToken: An Effective Image Tokenizer with 2D Gaussian Splatting
by: Dong, Jiajun, et al.
Published: (2025)
Bring Metric Functions into Diffusion Models
by: An, Jie, et al.
Published: (2024)
VoCo-LLaMA: Towards Vision Compression with Large Language Models
by: Ye, Xubing, et al.
Published: (2024)
PolSAM: Polarimetric Scattering Mechanism Informed Segment Anything Model
by: Wang, Yuqing, et al.
Published: (2024)
Pseudo Depth Meets Gaussian: A Feed-forward RGB SLAM Baseline
by: Zhao, Linqing, et al.
Published: (2025)
Caption Anything in Video: Fine-grained Object-centric Captioning via Spatiotemporal Multimodal Prompting
by: Tang, Yunlong, et al.
Published: (2025)
LOGO: A Long-Form Video Dataset for Group Action Quality Assessment
by: Zhang, Shiyi, et al.
Published: (2024)
Narrative Action Evaluation with Prompt-Guided Multimodal Interaction
by: Zhang, Shiyi, et al.
Published: (2024)
GeoLRM: Geometry-Aware Large Reconstruction Model for High-Quality 3D Gaussian Generation
by: Zhang, Chubin, et al.
Published: (2024)
MergeSAM: Unsupervised change detection of remote sensing images based on the Segment Anything Model
by: Hu, Meiqi, et al.
Published: (2025)
Register Anything: Estimating "Corresponding Prompts" for Segment Anything Model
by: Huang, Shiqi, et al.
Published: (2025)
Idea2Img: Iterative Self-Refinement with GPT-4V(ision) for Automatic Image Design and Generation
by: Yang, Zhengyuan, et al.
Published: (2023)
GoodSAM++: Bridging Domain and Capacity Gaps via Segment Anything Model for Panoramic Semantic Segmentation
by: Zhang, Weiming, et al.
Published: (2024)
TinySAM: Pushing the Envelope for Efficient Segment Anything Model
by: Shu, Han, et al.
Published: (2023)
FlowIE: Efficient Image Enhancement via Rectified Flow
by: Zhu, Yixuan, et al.
Published: (2024)
DPMesh: Exploiting Diffusion Prior for Occluded Human Mesh Recovery
by: Zhu, Yixuan, et al.
Published: (2024)
Universal Segmentation at Arbitrary Granularity with Language Instruction
by: Liu, Yong, et al.
Published: (2023)
SAM Meets UAP: Attacking Segment Anything Model With Universal Adversarial Perturbation
by: Han, Dongshen, et al.
Published: (2023)
Matte Anything: Interactive Natural Image Matting with Segment Anything Models
by: Yao, Jingfeng, et al.
Published: (2023)
GoodSAM: Bridging Domain and Capacity Gaps via Segment Anything Model for Distortion-aware Panoramic Semantic Segmentation
by: Zhang, Weiming, et al.
Published: (2024)
I-MedSAM: Implicit Medical Image Segmentation with Segment Anything
by: Wei, Xiaobao, et al.
Published: (2023)
Fully Aligned Network for Referring Image Segmentation
by: Liu, Yong, et al.
Published: (2024)
Continual Learning for Segment Anything Model Adaptation
by: Yang, Jinglong, et al.
Published: (2024)
LiVOS: Light Video Object Segmentation with Gated Linear Matching
by: Liu, Qin, et al.
Published: (2024)
SANeRF-HQ: Segment Anything for NeRF in High Quality
by: Liu, Yichen, et al.
Published: (2023)
Causal Prompt Calibration Guided Segment Anything Model for Open-Vocabulary Multi-Entity Segmentation
by: Wang, Jingyao, et al.
Published: (2025)
OccNeRF: Advancing 3D Occupancy Prediction in LiDAR-Free Environments
by: Zhang, Chubin, et al.
Published: (2023)
RemoteSAM: Towards Segment Anything for Earth Observation
by: Yao, Liang, et al.
Published: (2025)
Completing Visual Objects via Bridging Generation and Segmentation
by: Li, Xiang, et al.
Published: (2023)
Open-Vocabulary Segmentation with Semantic-Assisted Calibration
by: Liu, Yong, et al.
Published: (2023)
CLAP: Contrastive Latent Action Pretraining for Learning Vision-Language-Action Models from Human Videos
by: Zhang, Chubin, et al.
Published: (2026)
CaptionQA: Is Your Caption as Useful as the Image Itself?
by: Yang, Shijia, et al.
Published: (2025)