:: Library Catalog

Bibliographic Details
Main Authors:	Lan, Guanzhou; Ma, Qianli; Yang, Yuqi; Wang, Zhigang; Wang, Dong; Li, Xuelong; Zhao, Bin
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2410.12346

Similar Items

Understanding Degradation with Vision Language Model
by: Lan, Guanzhou, et al.
Published: (2026)

Night-to-Day Translation via Illumination Degradation Disentanglement
by: Lan, Guanzhou, et al.
Published: (2024)

Closed-Loop Action Chunks with Dynamic Corrections for Training-Free Diffusion Policy
by: Wu, Pengyuan, et al.
Published: (2026)

Point-PEFT: Parameter-Efficient Fine-Tuning for 3D Pre-trained Models
by: Tang, Yiwen, et al.
Published: (2023)

AerialVG: A Challenging Benchmark for Aerial Visual Grounding by Exploring Positional Relations
by: Liu, Junli, et al.
Published: (2025)

Diffusion-guided Generalizable Enhancer for Urban Scene Reconstruction
by: Che, Henry, et al.
Published: (2026)

Detail++: Training-Free Detail Enhancer for Text-to-Image Diffusion Models
by: Chen, Lifeng, et al.
Published: (2025)

MoMa-Kitchen: A 100K+ Benchmark for Affordance-Grounded Last-Mile Navigation in Mobile Manipulation
by: Zhang, Pingrui, et al.
Published: (2025)

Cross from Left to Right Brain: Adaptive Text Dreamer for Vision-and-Language Navigation
by: Zhang, Pingrui, et al.
Published: (2025)

Sat2Flow: A Structure-Aware Diffusion Framework for Human Flow Generation from Satellite Imagery
by: Wang, Xiangxu, et al.
Published: (2025)

Exploring the Potential of Encoder-free Architectures in 3D LMMs
by: Tang, Yiwen, et al.
Published: (2025)

Exploring Efficient Open-Vocabulary Segmentation in the Remote Sensing
by: Li, Bingyu, et al.
Published: (2025)

Decouple-Then-Merge: Finetune Diffusion Models as Multi-Task Learning
by: Ma, Qianli, et al.
Published: (2024)

DiffusionHarmonizer: Bridging Neural Reconstruction and Photorealistic Simulation with Online Diffusion Enhancer
by: Zhang, Yuxuan, et al.
Published: (2026)

Multi-Knowledge-oriented Nighttime Haze Imaging Enhancer for Vision-driven Intelligent Systems
by: Chen, Ai, et al.
Published: (2025)

Towards Efficient Low-rate Image Compression with Frequency-aware Diffusion Prior Refinement
by: Xia, Yichong, et al.
Published: (2026)

Any2Point: Empowering Any-modality Large Models for Efficient 3D Understanding
by: Tang, Yiwen, et al.
Published: (2024)

Diffusion Models in Low-Level Vision: A Survey
by: He, Chunming, et al.
Published: (2024)

MARIS: Marine Open-Vocabulary Instance Segmentation with Geometric Enhancement and Semantic Alignment
by: Li, Bingyu, et al.
Published: (2025)

CPA-Enhancer: Chain-of-Thought Prompted Adaptive Enhancer for Object Detection under Unknown Degradations
by: Zhang, Yuwei, et al.
Published: (2024)

3DGS-Enhancer: Enhancing Unbounded 3D Gaussian Splatting with View-consistent 2D Diffusion Priors
by: Liu, Xi, et al.
Published: (2024)

A Parameter-Efficient Mixture-of-Experts Framework for Cross-Modal Geo-Localization
by: Li, LinFeng, et al.
Published: (2025)

Augmenting Prototype Network with TransMix for Few-shot Hyperspectral Image Classification
by: Liu, Chun, et al.
Published: (2024)

VELoRA: A Low-Rank Adaptation Approach for Efficient RGB-Event based Recognition
by: Chen, Lan, et al.
Published: (2024)

DiffusionReward: Enhancing Blind Face Restoration through Reward Feedback Learning
by: Wu, Bin, et al.
Published: (2025)

Quaternion Generative Adversarial Neural Networks and Applications to Color Image Inpainting
by: Wang, Duan, et al.
Published: (2024)

Towards Low-Latency Event Stream-based Visual Object Tracking: A Slow-Fast Approach
by: Wang, Shiao, et al.
Published: (2025)

Degrees of Freedom Matter: Inferring Dynamics from Point Trajectories
by: Zhang, Yan, et al.
Published: (2024)

Enhance Vision-Language Alignment with Noise
by: Huang, Sida, et al.
Published: (2024)

CNN2GNN: How to Bridge CNN with GNN
by: Jiao, Ziheng, et al.
Published: (2024)

Do MLLMs Really See It: Reinforcing Visual Attention in Multimodal LLMs
by: Ou, Siqu, et al.
Published: (2026)

OrthoDiffusion: A Generalizable Multi-Task Diffusion Foundation Model for Musculoskeletal MRI Interpretation
by: Lan, Tian, et al.
Published: (2026)

A Survey on Cache Methods in Diffusion Models: Toward Efficient Multi-Modal Generation
by: Liu, Jiacheng, et al.
Published: (2025)

BFA-YOLO: A balanced multiscale object detection network for building façade attachments detection
by: Chen, Yangguang, et al.
Published: (2024)

DraftAttention: Fast Video Diffusion via Low-Resolution Attention Guidance
by: Shen, Xuan, et al.
Published: (2025)

SDiT: Spiking Diffusion Model with Transformer
by: Yang, Shu, et al.
Published: (2024)

PanoLora: Bridging Perspective and Panoramic Video Generation with LoRA Adaptation
by: Dong, Zeyu, et al.
Published: (2025)

EADReg: Probabilistic Correspondence Generation with Efficient Autoregressive Diffusion Model for Outdoor Point Cloud Registration
by: Gong, Linrui, et al.
Published: (2024)

Exploring the Underwater World Segmentation without Extra Training
by: Li, Bingyu, et al.
Published: (2025)

An Open-Source Benchmark and Baseline for Multi-temporal Referring Segmentation
by: Li, Bingyu, et al.
Published: (2026)