| Main Authors: | Lan, Guanzhou, Ma, Qianli, Yang, Yuqi, Wang, Zhigang, Wang, Dong, Li, Xuelong, Zhao, Bin |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2410.12346 |
Similar Items
Understanding Degradation with Vision Language Model
by: Lan, Guanzhou, et al.
Published: (2026)
Night-to-Day Translation via Illumination Degradation Disentanglement
by: Lan, Guanzhou, et al.
Published: (2024)
Closed-Loop Action Chunks with Dynamic Corrections for Training-Free Diffusion Policy
by: Wu, Pengyuan, et al.
Published: (2026)
Point-PEFT: Parameter-Efficient Fine-Tuning for 3D Pre-trained Models
by: Tang, Yiwen, et al.
Published: (2023)
AerialVG: A Challenging Benchmark for Aerial Visual Grounding by Exploring Positional Relations
by: Liu, Junli, et al.
Published: (2025)
Diffusion-guided Generalizable Enhancer for Urban Scene Reconstruction
by: Che, Henry, et al.
Published: (2026)
Detail++: Training-Free Detail Enhancer for Text-to-Image Diffusion Models
by: Chen, Lifeng, et al.
Published: (2025)
MoMa-Kitchen: A 100K+ Benchmark for Affordance-Grounded Last-Mile Navigation in Mobile Manipulation
by: Zhang, Pingrui, et al.
Published: (2025)
Cross from Left to Right Brain: Adaptive Text Dreamer for Vision-and-Language Navigation
by: Zhang, Pingrui, et al.
Published: (2025)
Sat2Flow: A Structure-Aware Diffusion Framework for Human Flow Generation from Satellite Imagery
by: Wang, Xiangxu, et al.
Published: (2025)
Exploring the Potential of Encoder-free Architectures in 3D LMMs
by: Tang, Yiwen, et al.
Published: (2025)
Exploring Efficient Open-Vocabulary Segmentation in the Remote Sensing
by: Li, Bingyu, et al.
Published: (2025)
Decouple-Then-Merge: Finetune Diffusion Models as Multi-Task Learning
by: Ma, Qianli, et al.
Published: (2024)
DiffusionHarmonizer: Bridging Neural Reconstruction and Photorealistic Simulation with Online Diffusion Enhancer
by: Zhang, Yuxuan, et al.
Published: (2026)
Multi-Knowledge-oriented Nighttime Haze Imaging Enhancer for Vision-driven Intelligent Systems
by: Chen, Ai, et al.
Published: (2025)
Towards Efficient Low-rate Image Compression with Frequency-aware Diffusion Prior Refinement
by: Xia, Yichong, et al.
Published: (2026)
Any2Point: Empowering Any-modality Large Models for Efficient 3D Understanding
by: Tang, Yiwen, et al.
Published: (2024)
Diffusion Models in Low-Level Vision: A Survey
by: He, Chunming, et al.
Published: (2024)
MARIS: Marine Open-Vocabulary Instance Segmentation with Geometric Enhancement and Semantic Alignment
by: Li, Bingyu, et al.
Published: (2025)
CPA-Enhancer: Chain-of-Thought Prompted Adaptive Enhancer for Object Detection under Unknown Degradations
by: Zhang, Yuwei, et al.
Published: (2024)
3DGS-Enhancer: Enhancing Unbounded 3D Gaussian Splatting with View-consistent 2D Diffusion Priors
by: Liu, Xi, et al.
Published: (2024)
A Parameter-Efficient Mixture-of-Experts Framework for Cross-Modal Geo-Localization
by: Li, LinFeng, et al.
Published: (2025)
Augmenting Prototype Network with TransMix for Few-shot Hyperspectral Image Classification
by: Liu, Chun, et al.
Published: (2024)
VELoRA: A Low-Rank Adaptation Approach for Efficient RGB-Event based Recognition
by: Chen, Lan, et al.
Published: (2024)
DiffusionReward: Enhancing Blind Face Restoration through Reward Feedback Learning
by: Wu, Bin, et al.
Published: (2025)
Quaternion Generative Adversarial Neural Networks and Applications to Color Image Inpainting
by: Wang, Duan, et al.
Published: (2024)
Towards Low-Latency Event Stream-based Visual Object Tracking: A Slow-Fast Approach
by: Wang, Shiao, et al.
Published: (2025)
Degrees of Freedom Matter: Inferring Dynamics from Point Trajectories
by: Zhang, Yan, et al.
Published: (2024)
Enhance Vision-Language Alignment with Noise
by: Huang, Sida, et al.
Published: (2024)
CNN2GNN: How to Bridge CNN with GNN
by: Jiao, Ziheng, et al.
Published: (2024)
Do MLLMs Really See It: Reinforcing Visual Attention in Multimodal LLMs
by: Ou, Siqu, et al.
Published: (2026)
OrthoDiffusion: A Generalizable Multi-Task Diffusion Foundation Model for Musculoskeletal MRI Interpretation
by: Lan, Tian, et al.
Published: (2026)
A Survey on Cache Methods in Diffusion Models: Toward Efficient Multi-Modal Generation
by: Liu, Jiacheng, et al.
Published: (2025)
BFA-YOLO: A balanced multiscale object detection network for building façade attachments detection
by: Chen, Yangguang, et al.
Published: (2024)
DraftAttention: Fast Video Diffusion via Low-Resolution Attention Guidance
by: Shen, Xuan, et al.
Published: (2025)
SDiT: Spiking Diffusion Model with Transformer
by: Yang, Shu, et al.
Published: (2024)
PanoLora: Bridging Perspective and Panoramic Video Generation with LoRA Adaptation
by: Dong, Zeyu, et al.
Published: (2025)
EADReg: Probabilistic Correspondence Generation with Efficient Autoregressive Diffusion Model for Outdoor Point Cloud Registration
by: Gong, Linrui, et al.
Published: (2024)
Exploring the Underwater World Segmentation without Extra Training
by: Li, Bingyu, et al.
Published: (2025)
An Open-Source Benchmark and Baseline for Multi-temporal Referring Segmentation
by: Li, Bingyu, et al.
Published: (2026)