| Main Authors: | Mao, Fangyuan, Wang, Shuo, Mei, Jilin, Lu, Shun, Min, Chen, Liu, Fuyang, Feng, Xiaokun, Wu, Meiqi, Hu, Yu |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2509.15642 |
Similar Items
PID: Physics-Informed Diffusion Model for Infrared Image Generation
by: Mao, Fangyuan, et al.
Published: (2024)
CORENet: Cross-Modal 4D Radar Denoising Network with LiDAR Supervision for Autonomous Driving
by: Liu, Fuyang, et al.
Published: (2025)
MASTER: Multimodal Segmentation with Text Prompts
by: Liu, Fuyang, et al.
Published: (2025)
Omni-Effects: Unified and Spatially-Controllable Visual Effects Generation
by: Mao, Fangyuan, et al.
Published: (2025)
Ground4D: Spatially-Grounded Feedforward 4D Reconstruction for Unstructured Off-Road Scenes
by: Wang, Shuo, et al.
Published: (2026)
Stochastic Self-Guidance for Training-Free Enhancement of Diffusion Models
by: Chen, Chubin, et al.
Published: (2025)
ImagerySearch: Adaptive Test-Time Search for Video Generation Beyond Semantic Dependency Constraints
by: Wu, Meiqi, et al.
Published: (2025)
On Modality Incomplete Infrared-Visible Object Detection: An Architecture Compatibility Perspective
by: Yang, Shuo, et al.
Published: (2025)
Beyond Endpoints: Path-Centric Reasoning for Vectorized Off-Road Network Extraction
by: Guan, Wenfei, et al.
Published: (2025)
Towards All-Day Perception for Off-Road Driving: A Large-Scale Multispectral Dataset and Comprehensive Benchmark
by: Wang, Shuo, et al.
Published: (2026)
DTLLM-VLT: Diverse Text Generation for Visual Language Tracking Based on LLM
by: Li, Xuchen, et al.
Published: (2024)
From Cross-Modal to Mixed-Modal Visible-Infrared Re-Identification
by: Alehdaghi, Mahdi, et al.
Published: (2025)
Bridging the Gap: Multi-Level Cross-Modality Joint Alignment for Visible-Infrared Person Re-Identification
by: Liang, Tengfei, et al.
Published: (2023)
InfMAE: A Foundation Model in the Infrared Modality
by: Liu, Fangcen, et al.
Published: (2024)
How Texts Help? A Fine-grained Evaluation to Reveal the Role of Language in Vision-Language Tracking
by: Li, Xuchen, et al.
Published: (2024)
Visual Language Tracking with Multi-modal Interaction: A Robust Benchmark
by: Li, Xuchen, et al.
Published: (2024)
DTVLT: A Multi-modal Diverse Text Benchmark for Visual Language Tracking Based on LLM
by: Li, Xuchen, et al.
Published: (2024)
Dual-level Modality Debiasing Learning for Unsupervised Visible-Infrared Person Re-Identification
by: Li, Jiaze, et al.
Published: (2025)
Dynamic Modality-Camera Invariant Clustering for Unsupervised Visible-Infrared Person Re-identification
by: Yang, Yiming, et al.
Published: (2024)
FreDFT: Frequency Domain Fusion Transformer for Visible-Infrared Object Detection
by: Wu, Wencong, et al.
Published: (2025)
Hyperbolic Cycle Alignment for Infrared-Visible Image Fusion
by: Li, Timing, et al.
Published: (2025)
Latent Temporal Discrepancy as Motion Prior: A Loss-Weighting Strategy for Dynamic Fidelity in T2V
by: Wu, Meiqi, et al.
Published: (2026)
Modality-Transition Representation Learning for Visible-Infrared Person Re-Identification
by: Yuan, Chao, et al.
Published: (2025)
Extended Cross-Modality United Learning for Unsupervised Visible-Infrared Person Re-identification
by: Wu, Ruixing, et al.
Published: (2024)
CM-Diff: A Single Generative Network for Bidirectional Cross-Modality Translation Diffusion Model Between Infrared and Visible Images
by: Hu, Bin, et al.
Published: (2025)
MiPa: Mixed Patch Infrared-Visible Modality Agnostic Object Detection
by: Medeiros, Heitor R., et al.
Published: (2024)
Counterfactual Intervention Feature Transfer for Visible-Infrared Person Re-identification
by: Li, Xulin, et al.
Published: (2022)
VIFNet: An End-to-end Visible-Infrared Fusion Network for Image Dehazing
by: Yu, Meng, et al.
Published: (2024)
Bridging Human Evaluation to Infrared and Visible Image Fusion
by: Liu, Jinyuan, et al.
Published: (2026)
HumanAesExpert: Advancing a Multi-Modality Foundation Model for Human Image Aesthetic Assessment
by: Liao, Zhichao, et al.
Published: (2025)
WildOcc: A Benchmark for Off-Road 3D Semantic Occupancy Prediction
by: Zhai, Heng, et al.
Published: (2024)
Unified Restoration-Perception Learning: Maritime Infrared-Visible Image Fusion and Segmentation
by: Cai, Weichao, et al.
Published: (2026)
Exposing Vulnerabilities in Visible-Infrared VLMs: A Unified Geometric Adversarial Framework with Cross-Task Transferability
by: Chen, Xiang, et al.
Published: (2026)
Real-World Adverse Weather Image Restoration via Dual-Level Reinforcement Learning with High-Quality Cold Start
by: Liu, Fuyang, et al.
Published: (2025)
Visible-Infrared Person Re-Identification via Patch-Mixed Cross-Modality Learning
by: Qian, Zhihao, et al.
Published: (2023)
RingMo-Agent: A Unified Remote Sensing Foundation Model for Multi-Platform and Multi-Modal Reasoning
by: Hu, Huiyang, et al.
Published: (2025)
MergeSAM: Unsupervised change detection of remote sensing images based on the Segment Anything Model
by: Hu, Meiqi, et al.
Published: (2025)
Modality-Aware Infrared and Visible Image Fusion with Target-Aware Supervision
by: Sun, Tianyao, et al.
Published: (2025)
Learning Language-Driven Sequence-Level Modal-Invariant Representations for Video-Based Visible-Infrared Person Re-Identification
by: Yang, Xiaomei, et al.
Published: (2026)
RingMoE: Mixture-of-Modality-Experts Multi-Modal Foundation Models for Universal Remote Sensing Image Interpretation
by: Bi, Hanbo, et al.
Published: (2025)