| Main Authors: | Zhang, Songyan, Sun, Xinyu, Chen, Hao, Li, Bo, Shen, Chunhua |
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2310.11755 |
Similar Items
POMATO: Marrying Pointmap Matching with Temporal Motion for Dynamic 3D Reconstruction
by: Zhang, Songyan, et al.
Published: (2025)
Digging Into Normal Incorporated Stereo Matching
by: Liu, Zihua, et al.
Published: (2024)
RGM: Reconstructing High-fidelity 3D Car Assets with Relightable 3D-GS Generative Model from a Single Image
by: Chen, Xiaoxue, et al.
Published: (2024)
Matcher: Segment Anything with One Shot Using All-Purpose Feature Matching
by: Liu, Yang, et al.
Published: (2023)
WiseAD: Knowledge Augmented End-to-End Autonomous Driving with Vision-Language Model
by: Zhang, Songyan, et al.
Published: (2024)
SpecRef: A Fast Training-free Baseline of Specific Reference-Condition Real Image Editing
by: Chen, Songyan, et al.
Published: (2024)
StaMo: Unsupervised Learning of Generalizable Robot Motion from Compact State Representation
by: Liu, Mingyu, et al.
Published: (2025)
Resolve Domain Conflicts for Generalizable Remote Physiological Measurement
by: Sun, Weiyu, et al.
Published: (2024)
VisionLLaMA: A Unified LLaMA Backbone for Vision Tasks
by: Chu, Xiangxiang, et al.
Published: (2024)
MIFNet: Learning Modality-Invariant Features for Generalizable Multimodal Image Matching
by: Liu, Yepeng, et al.
Published: (2025)
Bridge Thinking and Acting: Unleashing Physical Potential of VLM with Generalizable Action Expert
by: Liu, Mingyu, et al.
Published: (2025)
GeoBench: Benchmarking and Analyzing Monocular Geometry Estimation Models
by: Ge, Yongtao, et al.
Published: (2024)
MovieDreamer: Hierarchical Generation for Coherent Long Visual Sequence
by: Zhao, Canyu, et al.
Published: (2024)
DMS: Diffusion-Based Multi-Baseline Stereo Generation for Improving Self-Supervised Depth Estimation
by: Liu, Zihua, et al.
Published: (2025)
A CLIP-Powered Framework for Robust and Generalizable Data Selection
by: Yang, Suorong, et al.
Published: (2024)
Learning Efficient and Generalizable Human Representation with Human Gaussian Model
by: Liu, Yifan, et al.
Published: (2025)
DICEPTION: A Generalist Diffusion Model for Visual Perceptual Tasks
by: Zhao, Canyu, et al.
Published: (2025)
Generalizable Single-view Object Pose Estimation by Two-side Generating and Matching
by: Sun, Yujing, et al.
Published: (2024)
DriveX: Omni Scene Modeling for Learning Generalizable World Knowledge in Autonomous Driving
by: Shi, Chen, et al.
Published: (2025)
Explicit Correspondence Matching for Generalizable Neural Radiance Fields
by: Chen, Yuedong, et al.
Published: (2023)
OmniGlue: Generalizable Feature Matching with Foundation Model Guidance
by: Jiang, Hanwen, et al.
Published: (2024)
Diffusion Models are Efficient Data Generators for Human Mesh Recovery
by: Ge, Yongtao, et al.
Published: (2024)
Tinker: Diffusion's Gift to 3D--Multi-View Consistent Editing From Sparse Inputs without Per-Scene Optimization
by: Zhao, Canyu, et al.
Published: (2025)
MobileVLM V2: Faster and Stronger Baseline for Vision Language Model
by: Chu, Xiangxiang, et al.
Published: (2024)
Frequency Prior Guided Matching: A Data Augmentation Approach for Generalizable Semi-Supervised Polyp Segmentation
by: Xi, Haoran, et al.
Published: (2025)
Unlocking the Power of Critical Factors for 3D Visual Geometry Estimation
by: Xu, Guangkai, et al.
Published: (2026)
A Simple Image Segmentation Framework via In-Context Examples
by: Liu, Yang, et al.
Published: (2024)
CFDNet: A Generalizable Foggy Stereo Matching Network with Contrastive Feature Distillation
by: Liu, Zihua, et al.
Published: (2024)
Training-Free Motion Customization for Distilled Video Generators with Adaptive Test-Time Distillation
by: Rong, Jintao, et al.
Published: (2025)
SurfaceSplat: Connecting Surface Reconstruction and Gaussian Splatting
by: Gao, Zihui, et al.
Published: (2025)
Unified Open-World Segmentation with Multi-Modal Prompts
by: Liu, Yang, et al.
Published: (2025)
FreeCompose: Generic Zero-Shot Image Composition with Diffusion Prior
by: Chen, Zhekai, et al.
Published: (2024)
MobileVLM : A Fast, Strong and Open Vision Language Assistant for Mobile Devices
by: Chu, Xiangxiang, et al.
Published: (2023)
Where to Look: Can Foundation Models Reach a Target Viewpoint Through Active Exploration?
by: Li, Liyang, et al.
Published: (2026)
Meta-FC: Meta-Learning with Feature Consistency for Robust and Generalizable Watermarking
by: Li, Yuheng, et al.
Published: (2026)
Towards Efficient Pixel Labeling for Industrial Anomaly Detection and Localization
by: Li, Hanxi, et al.
Published: (2024)
Metric3Dv2: A Versatile Monocular Geometric Foundation Model for Zero-shot Metric Depth and Surface Normal Estimation
by: Hu, Mu, et al.
Published: (2024)
PerturboLLaVA: Reducing Multimodal Hallucinations with Perturbative Visual Training
by: Chen, Cong, et al.
Published: (2025)
On the Robustness of Medical Vision-Language Models: Are they Truly Generalizable?
by: Imam, Raza, et al.
Published: (2025)
Object-aware Inversion and Reassembly for Image Editing
by: Yang, Zhen, et al.
Published: (2023)