| Main Authors: | Wu, Junde, Zhu, Jiayuan, Xu, Min, Jin, Yueming |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2403.05703 |
Similar Items
One-Prompt to Segment All Medical Images
by: Wu, Junde, et al.
Published: (2023)
Medical SAM 2: Segment medical images as video via Segment Anything Model 2
by: Zhu, Jiayuan, et al.
Published: (2024)
MGI: Multimodal Contrastive pre-training of Genomic and Medical Imaging
by: Zhou, Jiaying, et al.
Published: (2024)
MedUHIP: Towards Human-In-the-Loop Medical Segmentation
by: Zhu, Jiayuan, et al.
Published: (2024)
Medical SAM Adapter: Adapting Segment Anything Model for Medical Image Segmentation
by: Wu, Junde, et al.
Published: (2023)
Medical Graph RAG: Towards Safe Medical Large Language Model via Graph Retrieval-Augmented Generation
by: Wu, Junde, et al.
Published: (2024)
MedVAR: Towards Scalable and Efficient Medical Image Generation via Next-scale Autoregressive Prediction
by: He, Zhicheng, et al.
Published: (2026)
Towards Collective Intelligence: Uncertainty-aware SAM Adaptation for Ambiguous Medical Image Segmentation
by: Jiang, Mingzhou, et al.
Published: (2024)
MedOpenClaw and MedFlowBench: Auditing Medical Agents in Full-Study Workflows
by: Shen, Weixiang, et al.
Published: (2026)
Surgical SAM 2: Real-time Segment Anything in Surgical Video by Efficient Frame Pruning
by: Liu, Haofeng, et al.
Published: (2024)
SPA: Efficient User-Preference Alignment against Uncertainty in Medical Image Segmentation
by: Zhu, Jiayuan, et al.
Published: (2024)
Scalable Object Detection in the Car Interior With Vision Foundation Models
by: Schmidt, Sebastian, et al.
Published: (2025)
From Failure to Feedback: Group Revision Unlocks Hard Cases in Object-Level Grounding
by: Liu, Yuyuan, et al.
Published: (2026)
in-Car Biometrics (iCarB) Datasets for Driver Recognition: Face, Fingerprint, and Voice
by: Hahn, Vedrana Krivokuca, et al.
Published: (2024)
Learning to Correction: Explainable Feedback Generation for Visual Commonsense Reasoning Distractor
by: Chen, Jiali, et al.
Published: (2024)
Structure Matters: Revisiting Boundary Refinement in Video Object Segmentation
by: Qin, Guanyi, et al.
Published: (2025)
ReSurgSAM2: Referring Segment Anything in Surgical Video via Credible Long-term Tracking
by: Liu, Haofeng, et al.
Published: (2025)
DreamCar: Leveraging Car-specific Prior for in-the-wild 3D Car Reconstruction
by: Du, Xiaobiao, et al.
Published: (2024)
Car-GS: Addressing Reflective and Transparent Surface Challenges in 3D Car Reconstruction
by: Li, Congcong, et al.
Published: (2025)
MedVLM-R1: Incentivizing Medical Reasoning Capability of Vision-Language Models (VLMs) via Reinforcement Learning
by: Pan, Jiazhen, et al.
Published: (2025)
Visual WetlandBirds Dataset: Bird Species Identification and Behavior Recognition in Videos
by: Rodriguez-Juan, Javier, et al.
Published: (2025)
3DMedAgent: Unified Perception-to-Understanding for 3D Medical Analysis
by: Wang, Ziyue, et al.
Published: (2026)
3DRealCar: An In-the-wild RGB-D Car Dataset with 360-degree Views
by: Du, Xiaobiao, et al.
Published: (2024)
An Effective End-to-End Solution for Multimodal Action Recognition
by: Wang, Songping, et al.
Published: (2025)
Range and Bird's Eye View Fused Cross-Modal Visual Place Recognition
by: Peng, Jianyi, et al.
Published: (2025)
Infrared Adversarial Car Stickers
by: Zhu, Xiaopei, et al.
Published: (2024)
From Articulated Kinematics to Routed Visual Control for Action-Conditioned Surgical Video Generation
by: Li, Bohan, et al.
Published: (2026)
DiffusionAgent: Navigating Expert Models for Agentic Image Generation
by: Qin, Jie, et al.
Published: (2024)
ToolTipNet: A Segmentation-Driven Deep Learning Baseline for Surgical Instrument Tip Detection
by: Wu, Zijian, et al.
Published: (2025)
Improving Visual Recognition with Hyperbolical Visual Hierarchy Mapping
by: Kwon, Hyeongjun, et al.
Published: (2024)
Generalized Deep Multi-view Clustering via Causal Learning with Partially Aligned Cross-view Correspondence
by: Yang, Xihong, et al.
Published: (2025)
You Only Look at Once for Real-time and Generic Multi-Task
by: Wang, Jiayuan, et al.
Published: (2023)
GeRM: A Generative Rendering Model From Physically Realistic to Photorealistic
by: Lu, Jiayuan, et al.
Published: (2026)
DTL: Disentangled Transfer Learning for Visual Recognition
by: Fu, Minghao, et al.
Published: (2023)
Exploiting Polarized Material Cues for Robust Car Detection
by: Dong, Wen, et al.
Published: (2024)
Semi-Supervised Learning for Visual Bird's Eye View Semantic Segmentation
by: Zhu, Junyu, et al.
Published: (2023)
TCFormer: Visual Recognition via Token Clustering Transformer
by: Zeng, Wang, et al.
Published: (2024)
AuralSAM2: Enabling SAM2 Hear Through Pyramid Audio-Visual Feature Prompting
by: Liu, Yuyuan, et al.
Published: (2025)
T2Vs Meet VLMs: A Scalable Multimodal Dataset for Visual Harmfulness Recognition
by: Yeh, Chen, et al.
Published: (2024)
Generalized Recognition of Basic Surgical Actions Enables Skill Assessment and Vision-Language-Model-based Surgical Planning
by: Xu, Mengya, et al.
Published: (2026)