| Main Authors: | Zhang, Wenyu, Tong, Yao, Liu, Yiqiu, Cao, Rui |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2511.13507 |
Similar Items
Mapping Urban Villages in China: Progress and Challenges
by: Cao, Rui, et al.
Published: (2025)
Vision-Based Localization in Dense Urban Environments: A Case Study of an Urban Village in China
by: Wu, Menglin, et al.
Published: (2026)
UV-SAM: Adapting Segment Anything Model for Urban Village Identification
by: Zhang, Xin, et al.
Published: (2024)
FasterDiT: Towards Faster Diffusion Transformers Training without Architecture Modification
by: Yao, Jingfeng, et al.
Published: (2024)
ViTGaze: Gaze Following with Interaction Features in Vision Transformers
by: Song, Yuehao, et al.
Published: (2024)
Building and Road Recognition in Dense Urban Informal Settlements: A Dataset and Benchmark
by: Long, Hongyu, et al.
Published: (2026)
Urban Scene Diffusion through Semantic Occupancy Map
by: Zhang, Junge, et al.
Published: (2024)
Recurrence-based Vanishing Point Detection
by: Bharadwaj, Skanda, et al.
Published: (2024)
DepthVanish: Optimizing Adversarial Interval Structures for Stereo-Depth-Invisible Patches
by: Xing, Yun, et al.
Published: (2025)
PersonViT: Large-scale Self-supervised Vision Transformer for Person Re-Identification
by: Hu, Bin, et al.
Published: (2024)
Bridging the Gap Between Sparsity and Redundancy: A Dual-Decoding Framework with Global Context for Map Inference
by: Shen, Yudong, et al.
Published: (2025)
MaTVLM: Hybrid Mamba-Transformer for Efficient Vision-Language Modeling
by: Li, Yingyue, et al.
Published: (2025)
Vanish into Thin Air: Cross-prompt Universal Adversarial Attacks for SAM2
by: Zhou, Ziqi, et al.
Published: (2025)
HiT: Building Mapping with Hierarchical Transformers
by: Zhang, Mingming, et al.
Published: (2023)
Improving Depth Gradient Continuity in Transformers: A Comparative Study on Monocular Depth Estimation with CNN
by: Yao, Jiawei, et al.
Published: (2023)
MapTRv2: An End-to-End Framework for Online Vectorized HD Map Construction
by: Liao, Bencheng, et al.
Published: (2023)
Matte Anything: Interactive Natural Image Matting with Segment Anything Models
by: Yao, Jingfeng, et al.
Published: (2023)
HGS-Mapping: Online Dense Mapping Using Hybrid Gaussian Representation in Urban Scenes
by: Wu, Ke, et al.
Published: (2024)
Hyper-Local Deformable Transformers for Text Spotting on Historical Maps
by: Lin, Yijun, et al.
Published: (2025)
Automated National Urban Map Extraction
by: Nasrallah, Hasan, et al.
Published: (2024)
Convex Relaxation for Robust Vanishing Point Estimation in Manhattan World
by: Liao, Bangyan, et al.
Published: (2025)
Gradually Vanishing Gap in Prototypical Network for Unsupervised Domain Adaptation
by: Wang, Shanshan, et al.
Published: (2024)
UV-Mamba: A DCN-Enhanced State Space Model for Urban Village Boundary Identification in High-Resolution Remote Sensing Images
by: Li, Lulin, et al.
Published: (2024)
From Street View to Visual Network: Mapping the Visibility of Urban Landmarks with Vision-Language Models
by: Fan, Zicheng, et al.
Published: (2025)
Uncertainty-Aware Gaussian Map for Vision-Language Navigation
by: Gao, Jianzhe, et al.
Published: (2026)
Urban Neural Surface Reconstruction from Constrained Sparse Aerial Imagery with 3D SAR Fusion
by: Li, Da, et al.
Published: (2026)
Take A Shortcut Back: Mitigating the Gradient Vanishing for Training Spiking Neural Networks
by: Guo, Yufei, et al.
Published: (2024)
ImagineMap: Enhanced HD Map Construction with SD Maps
by: Ji, Yishen, et al.
Published: (2024)
Turbo-VAED: Fast and Stable Transfer of Video-VAEs to Mobile Devices
by: Zou, Ya, et al.
Published: (2025)
Visible and Clear: Finding Tiny Objects in Difference Map
by: Cao, Bing, et al.
Published: (2024)
From Drone Imagery to Livability Mapping: AI-powered Environment Perception in Rural China
by: Deng, Weihuan, et al.
Published: (2025)
DiffTF++: 3D-aware Diffusion Transformer for Large-Vocabulary 3D Generation
by: Cao, Ziang, et al.
Published: (2024)
WeakTr: Exploring Plain Vision Transformer for Weakly-supervised Semantic Segmentation
by: Zhu, Lianghui, et al.
Published: (2023)
RT-DETRv2: Improved Baseline with Bag-of-Freebies for Real-Time Detection Transformer
by: Lv, Wenyu, et al.
Published: (2024)
VAD-GS: Visibility-Aware Densification for 3D Gaussian Splatting in Dynamic Urban Scenes
by: Zhang, Yikang, et al.
Published: (2025)
WeatherCity: Urban Scene Reconstruction with Controllable Multi-Weather Transformation
by: Wu, Wenhua, et al.
Published: (2026)
From Pixels to People: Satellite-Based Mapping and Quantification of Riverbank Erosion and Lost Villages in Bangladesh
by: Rafat, M Saifuzzaman, et al.
Published: (2025)
Vanishing-Point-Guided Video Semantic Segmentation of Driving Scenes
by: Guo, Diandian, et al.
Published: (2024)
GaussTR: Foundation Model-Aligned Gaussian Transformer for Self-Supervised 3D Spatial Understanding
by: Jiang, Haoyi, et al.
Published: (2024)
MolSight: Optical Chemical Structure Recognition with SMILES Pretraining, Multi-Granularity Learning and Reinforcement Learning
by: Zhang, Wenrui, et al.
Published: (2025)