| Main Authors: | Zang, Ying, Hu, Yuanqi, Chen, Xinyu, Xu, Yuxia, Wang, Suhui, Yu, Chunan, Zhu, Lanyun, Ji, Deyi, Xu, Xin, Chen, Tianrun |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2505.09998 |
Similar Items
Reasoning3D -- Grounding and Reasoning in 3D: Fine-Grained Zero-Shot Open-Vocabulary 3D Reasoning Part Segmentation via Large Vision-Language Models
by: Chen, Tianrun, et al.
Published: (2024)
xLSTM-UNet can be an Effective 2D & 3D Medical Image Segmentation Backbone with Vision-LSTM (ViL) better than its Mamba Counterpart
by: Chen, Tianrun, et al.
Published: (2024)
Img2CAD: Conditioned 3D CAD Model Generation from Single Image with Structured Visual Geometry
by: Chen, Tianrun, et al.
Published: (2024)
Syllables to Scenes: Literary-Guided Free-Viewpoint 3D Scene Synthesis from Japanese Haiku
by: Yu, Chunan, et al.
Published: (2025)
Breaking the Box: Enhancing Remote Sensing Image Segmentation with Freehand Sketches
by: Zang, Ying, et al.
Published: (2025)
HD-VGGT: High-Resolution Visual Geometry Transformer
by: Chen, Tianrun, et al.
Published: (2026)
Robust 4D Visual Geometry Transformer with Uncertainty-Aware Priors
by: Zang, Ying, et al.
Published: (2026)
Not Every Patch is Needed: Towards a More Efficient and Effective Backbone for Video-based Person Re-identification
by: Zhu, Lanyun, et al.
Published: (2025)
SAM2-Adapter: Evaluating & Adapting Segment Anything 2 in Downstream Tasks: Camouflage, Shadow, Medical Image Segmentation, and More
by: Chen, Tianrun, et al.
Published: (2024)
Let Human Sketches Help: Empowering Challenging Image Segmentation Task with Freehand Sketches
by: Zang, Ying, et al.
Published: (2025)
4DVGGT-D: 4D Visual Geometry Transformer with Improved Dynamic Depth Estimation
by: Zang, Ying, et al.
Published: (2026)
IBD: Alleviating Hallucinations in Large Vision-Language Models via Image-Biased Decoding
by: Zhu, Lanyun, et al.
Published: (2024)
Retrv-R1: A Reasoning-Driven MLLM Framework for Universal and Efficient Multimodal Retrieval
by: Zhu, Lanyun, et al.
Published: (2025)
LLaFS: When Large Language Models Meet Few-Shot Segmentation
by: Zhu, Lanyun, et al.
Published: (2023)
Magic3DSketch: Create Colorful 3D Models From Sketch-Based 3D Modeling Guided by Text and Language-Image Pre-Training
by: Zang, Ying, et al.
Published: (2024)
StreamCacheVGGT: Streaming Visual Geometry Transformers with Robust Scoring and Hybrid Cache Compression
by: Liu, Xuanyi, et al.
Published: (2026)
SAM3-Adapter: Efficient Adaptation of Segment Anything 3 for Camouflage Object Segmentation, Shadow Detection, and Medical Image Segmentation
by: Chen, Tianrun, et al.
Published: (2025)
RAVEN: Robust Advertisement Video Violation Temporal Grounding via Reinforcement Reasoning
by: Ji, Deyi, et al.
Published: (2025)
POPEN: Preference-Based Optimization and Ensemble for LVLM-Based Reasoning Segmentation
by: Zhu, Lanyun, et al.
Published: (2025)
Towards 3D VR-Sketch to 3D Shape Retrieval
by: Luo, Ling, et al.
Published: (2022)
3D VR Sketch Guided 3D Shape Prototyping and Exploration
by: Luo, Ling, et al.
Published: (2023)
RESMatch: Referring Expression Segmentation in a Semi-Supervised Manner
by: Zang, Ying, et al.
Published: (2024)
CamGeo: Sparse Camera-Conditioned Image-to-Video Generation with 3D Geometry Priors
by: Liu, Xuanyi, et al.
Published: (2026)
Self-signals Driven Multi-LLM Debate for Efficient and Accurate Reasoning
by: Chen, Xuhang, et al.
Published: (2025)
Order Matters: 3D Shape Generation from Sequential VR Sketches
by: Chen, Yizi, et al.
Published: (2025)
RAVEN++: Pinpointing Fine-Grained Violations in Advertisement Videos with Active Reinforcement Reasoning
by: Ji, Deyi, et al.
Published: (2025)
RAG-VR: Leveraging Retrieval-Augmented Generation for 3D Question Answering in VR Environments
by: Ding, Shiyi, et al.
Published: (2025)
Sketch3D: Style-Consistent Guidance for Sketch-to-3D Generation
by: Zheng, Wangguandong, et al.
Published: (2024)
VRsketch2Gaussian: 3D VR Sketch Guided 3D Object Generation with Gaussian Splatting
by: Gu, Songen, et al.
Published: (2025)
Tree-of-Table: Unleashing the Power of LLMs for Enhanced Large-Scale Table Understanding
by: Ji, Deyi, et al.
Published: (2024)
Air‐Stable Hydrogen‐Substituted Graphdiyne/2D Halide Perovskite Heterojunction for Self‐Powered Neuromorphic Vision
by: Yiming Yuan, et al.
Published: (2025)
Beyond Geometry: Artistic Disparity Synthesis for Immersive 2D-to-3D
by: Chen, Ping, et al.
Published: (2026)
FashionComposer: Compositional Fashion Image Generation
by: Ji, Sihui, et al.
Published: (2024)
ImViD: Immersive Volumetric Videos for Enhanced VR Engagement
by: Yang, Zhengxian, et al.
Published: (2025)
Instant-3D: Instant Neural Radiance Field Training Towards On-Device AR/VR 3D Reconstruction
by: Li, Sixu, et al.
Published: (2023)
Claw AI Lab: An Autonomous Multi-Agent Research Team
by: Wu, Fan, et al.
Published: (2026)
Zero-Shot 3D Drug Design by Sketching and Generating
by: Long, Siyu, et al.
Published: (2022)
StreamSense: Streaming Social Task Detection with Selective Vision-Language Model Routing
by: Wang, Han, et al.
Published: (2026)
Discrete Latent Perspective Learning for Segmentation and Detection
by: Ji, Deyi, et al.
Published: (2024)
SketchPlay: Intuitive Creation of Physically Realistic VR Content with Gesture-Driven Sketching
by: Zhang, Xiangwen, et al.
Published: (2025)