Bibliographic Details
Main Authors:	Mao, Fangyuan; Wang, Shuo; Mei, Jilin; Lu, Shun; Min, Chen; Liu, Fuyang; Feng, Xiaokun; Wu, Meiqi; Hu, Yu
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2509.15642

Similar Items

PID: Physics-Informed Diffusion Model for Infrared Image Generation
by: Mao, Fangyuan, et al.
Published: (2024)

CORENet: Cross-Modal 4D Radar Denoising Network with LiDAR Supervision for Autonomous Driving
by: Liu, Fuyang, et al.
Published: (2025)

MASTER: Multimodal Segmentation with Text Prompts
by: Liu, Fuyang, et al.
Published: (2025)

Omni-Effects: Unified and Spatially-Controllable Visual Effects Generation
by: Mao, Fangyuan, et al.
Published: (2025)

Ground4D: Spatially-Grounded Feedforward 4D Reconstruction for Unstructured Off-Road Scenes
by: Wang, Shuo, et al.
Published: (2026)

Stochastic Self-Guidance for Training-Free Enhancement of Diffusion Models
by: Chen, Chubin, et al.
Published: (2025)

ImagerySearch: Adaptive Test-Time Search for Video Generation Beyond Semantic Dependency Constraints
by: Wu, Meiqi, et al.
Published: (2025)

On Modality Incomplete Infrared-Visible Object Detection: An Architecture Compatibility Perspective
by: Yang, Shuo, et al.
Published: (2025)

Beyond Endpoints: Path-Centric Reasoning for Vectorized Off-Road Network Extraction
by: Guan, Wenfei, et al.
Published: (2025)

Towards All-Day Perception for Off-Road Driving: A Large-Scale Multispectral Dataset and Comprehensive Benchmark
by: Wang, Shuo, et al.
Published: (2026)

DTLLM-VLT: Diverse Text Generation for Visual Language Tracking Based on LLM
by: Li, Xuchen, et al.
Published: (2024)

From Cross-Modal to Mixed-Modal Visible-Infrared Re-Identification
by: Alehdaghi, Mahdi, et al.
Published: (2025)

Bridging the Gap: Multi-Level Cross-Modality Joint Alignment for Visible-Infrared Person Re-Identification
by: Liang, Tengfei, et al.
Published: (2023)

InfMAE: A Foundation Model in the Infrared Modality
by: Liu, Fangcen, et al.
Published: (2024)

How Texts Help? A Fine-grained Evaluation to Reveal the Role of Language in Vision-Language Tracking
by: Li, Xuchen, et al.
Published: (2024)

Visual Language Tracking with Multi-modal Interaction: A Robust Benchmark
by: Li, Xuchen, et al.
Published: (2024)

DTVLT: A Multi-modal Diverse Text Benchmark for Visual Language Tracking Based on LLM
by: Li, Xuchen, et al.
Published: (2024)

Dual-level Modality Debiasing Learning for Unsupervised Visible-Infrared Person Re-Identification
by: Li, Jiaze, et al.
Published: (2025)

Dynamic Modality-Camera Invariant Clustering for Unsupervised Visible-Infrared Person Re-identification
by: Yang, Yiming, et al.
Published: (2024)

FreDFT: Frequency Domain Fusion Transformer for Visible-Infrared Object Detection
by: Wu, Wencong, et al.
Published: (2025)

Hyperbolic Cycle Alignment for Infrared-Visible Image Fusion
by: Li, Timing, et al.
Published: (2025)

Latent Temporal Discrepancy as Motion Prior: A Loss-Weighting Strategy for Dynamic Fidelity in T2V
by: Wu, Meiqi, et al.
Published: (2026)

Modality-Transition Representation Learning for Visible-Infrared Person Re-Identification
by: Yuan, Chao, et al.
Published: (2025)

Extended Cross-Modality United Learning for Unsupervised Visible-Infrared Person Re-identification
by: Wu, Ruixing, et al.
Published: (2024)

CM-Diff: A Single Generative Network for Bidirectional Cross-Modality Translation Diffusion Model Between Infrared and Visible Images
by: Hu, Bin, et al.
Published: (2025)

MiPa: Mixed Patch Infrared-Visible Modality Agnostic Object Detection
by: Medeiros, Heitor R., et al.
Published: (2024)

Counterfactual Intervention Feature Transfer for Visible-Infrared Person Re-identification
by: Li, Xulin, et al.
Published: (2022)

VIFNet: An End-to-end Visible-Infrared Fusion Network for Image Dehazing
by: Yu, Meng, et al.
Published: (2024)

Bridging Human Evaluation to Infrared and Visible Image Fusion
by: Liu, Jinyuan, et al.
Published: (2026)

HumanAesExpert: Advancing a Multi-Modality Foundation Model for Human Image Aesthetic Assessment
by: Liao, Zhichao, et al.
Published: (2025)

WildOcc: A Benchmark for Off-Road 3D Semantic Occupancy Prediction
by: Zhai, Heng, et al.
Published: (2024)

Unified Restoration-Perception Learning: Maritime Infrared-Visible Image Fusion and Segmentation
by: Cai, Weichao, et al.
Published: (2026)

Exposing Vulnerabilities in Visible-Infrared VLMs: A Unified Geometric Adversarial Framework with Cross-Task Transferability
by: Chen, Xiang, et al.
Published: (2026)

Real-World Adverse Weather Image Restoration via Dual-Level Reinforcement Learning with High-Quality Cold Start
by: Liu, Fuyang, et al.
Published: (2025)

Visible-Infrared Person Re-Identification via Patch-Mixed Cross-Modality Learning
by: Qian, Zhihao, et al.
Published: (2023)

RingMo-Agent: A Unified Remote Sensing Foundation Model for Multi-Platform and Multi-Modal Reasoning
by: Hu, Huiyang, et al.
Published: (2025)

MergeSAM: Unsupervised change detection of remote sensing images based on the Segment Anything Model
by: Hu, Meiqi, et al.
Published: (2025)

Modality-Aware Infrared and Visible Image Fusion with Target-Aware Supervision
by: Sun, Tianyao, et al.
Published: (2025)

Learning Language-Driven Sequence-Level Modal-Invariant Representations for Video-Based Visible-Infrared Person Re-Identification
by: Yang, Xiaomei, et al.
Published: (2026)

RingMoE: Mixture-of-Modality-Experts Multi-Modal Foundation Models for Universal Remote Sensing Image Interpretation
by: Bi, Hanbo, et al.
Published: (2025)