:: Library Catalog

Bibliographic Details
Main Authors:	Xia, Linhan; Liu, Junbang; Wu, Tong
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2403.01370

Similar Items

Improving Depth Gradient Continuity in Transformers: A Comparative Study on Monocular Depth Estimation with CNN
by: Yao, Jiawei, et al.
Published: (2023)

UniCT Depth: Event-Image Fusion Based Monocular Depth Estimation with Convolution-Compensated ViT Dual SA Block
by: Jing, Luoxi, et al.
Published: (2025)

Unsupervised Monocular Depth Estimation Based on Hierarchical Feature-Guided Diffusion
by: Liu, Runze, et al.
Published: (2024)

3D Feature Prediction for Masked-AutoEncoder-Based Point Cloud Pretraining
by: Yan, Siming, et al.
Published: (2023)

RadarCam-Depth: Radar-Camera Fusion for Depth Estimation with Learned Metric Scale
by: Li, Han, et al.
Published: (2024)

Facial Affect Recognition based on Multi Architecture Encoder and Feature Fusion for the ABAW7 Challenge
by: Shen, Kang, et al.
Published: (2024)

360Recon: An Accurate Reconstruction Method Based on Depth Fusion from 360 Images
by: Yan, Zhongmiao, et al.
Published: (2024)

DepthFusion: Depth-Aware Hybrid Feature Fusion for LiDAR-Camera 3D Object Detection
by: Ji, Mingqian, et al.
Published: (2025)

OccFusion: Depth Estimation Free Multi-sensor Fusion for 3D Occupancy Prediction
by: Zhang, Ji, et al.
Published: (2024)

FiffDepth: Feed-forward Transformation of Diffusion-Based Generators for Detailed Depth Estimation
by: Bai, Yunpeng, et al.
Published: (2024)

SphereFusion: Efficient Panorama Depth Estimation via Gated Fusion
by: Yan, Qingsong, et al.
Published: (2025)

Florence-VL: Enhancing Vision-Language Models with Generative Vision Encoder and Depth-Breadth Fusion
by: Chen, Jiuhai, et al.
Published: (2024)

A Cross-Hierarchical Difference Feature Fusion Network Based on Multiscale Encoder-Decoder for Hyperspectral Change Detection
by: Sheng, Mingshuai, et al.
Published: (2025)

TacoDepth: Towards Efficient Radar-Camera Depth Estimation with One-stage Fusion
by: Wang, Yiran, et al.
Published: (2025)

Improving Video Diffusion Transformer Training by Multi-Feature Fusion and Alignment from Self-Supervised Vision Encoders
by: Lee, Dohun, et al.
Published: (2025)

Monocular Depth Estimation and Segmentation for Transparent Object with Iterative Semantic and Geometric Fusion
by: Liu, Jiangyuan, et al.
Published: (2025)

Enhanced Encoder-Decoder Architecture for Accurate Monocular Depth Estimation
by: Das, Dabbrata, et al.
Published: (2024)

Fusion to Enhance: Fusion Visual Encoder to Enhance Multimodal Language Model
by: She, Yifei, et al.
Published: (2025)

EndoDepthL: Lightweight Endoscopic Monocular Depth Estimation with CNN-Transformer
by: Li, Yangke
Published: (2023)

Hierarchical Awareness Adapters with Hybrid Pyramid Feature Fusion for Dense Depth Prediction
by: Su, Wuqi, et al.
Published: (2026)

The Devil is in the Edges: Monocular Depth Estimation with Edge-aware Consistency Fusion
by: Li, Pengzhi, et al.
Published: (2024)

Uni-Encoder Meets Multi-Encoders: Representation Before Fusion for Brain Tumor Segmentation with Missing Modalities
by: Song, Peibo, et al.
Published: (2026)

ExFusion: Efficient Transformer Training via Multi-Experts Fusion
by: Ruan, Jiacheng, et al.
Published: (2026)

ContrastiveGaussian: High-Fidelity 3D Generation with Contrastive Learning and Gaussian Splatting
by: Liu, Junbang, et al.
Published: (2025)

ScaleDepth: Decomposing Metric Depth Estimation into Scale Prediction and Relative Depth Estimation
by: Zhu, Ruijie, et al.
Published: (2024)

Cross-Modal RGB-D Fusion Transformer for 6D Pose Estimation of Non-Cooperative Spacecraft with Stereo-Derived Depth
by: Zhen, Yongliang, et al.
Published: (2026)

Multispectral Detection Transformer with Infrared-Centric Feature Fusion
by: Hwang, Seongmin, et al.
Published: (2025)

BRIDGE -- Building Reinforcement-Learning Depth-to-Image Data Generation Engine for Monocular Depth Estimation
by: Liu, Dingning, et al.
Published: (2025)

Federated Modality-specific Encoders and Partially Personalized Fusion Decoder for Multimodal Brain Tumor Segmentation
by: Liu, Hong, et al.
Published: (2026)

GlyphMastero: A Glyph Encoder for High-Fidelity Scene Text Editing
by: Wang, Tong, et al.
Published: (2025)

Revisiting 360 Depth Estimation with PanoGabor: A New Fusion Perspective
by: Shen, Zhijie, et al.
Published: (2024)

AnyDepth: Depth Estimation Made Easy
by: Ren, Zeyu, et al.
Published: (2026)

Mask-adaptive Gated Convolution and Bi-directional Progressive Fusion Network for Depth Completion
by: Huang, Tingxuan, et al.
Published: (2024)

Reliable Object Tracking by Multimodal Hybrid Feature Extraction and Transformer-Based Fusion
by: Sun, Hongze, et al.
Published: (2024)

Multi-Grained Feature Pruning for Video-Based Human Pose Estimation
by: Wang, Zhigang, et al.
Published: (2025)

Depth as Points: Center Point-based Depth Estimation
by: Tu, Zhiheng, et al.
Published: (2025)

TUNI: Real-time RGB-T Semantic Segmentation with Unified Multi-Modal Feature Extraction and Cross-Modal Feature Fusion
by: Guo, Xiaodong, et al.
Published: (2025)

PathFusion: Path-consistent Lidar-Camera Deep Feature Fusion
by: Wu, Lemeng, et al.
Published: (2022)

HTMNet: A Hybrid Network with Transformer-Mamba Bottleneck Multimodal Fusion for Transparent and Reflective Objects Depth Completion
by: Xie, Guanghu, et al.
Published: (2025)

Scalable Autoregressive Monocular Depth Estimation
by: Wang, Jinhong, et al.
Published: (2024)