| Main Authors: | Zhu, Zihao, Huang, Kuan-Ru, Xu, Zhaoming, Li, Renjie, Wu, Bo, Bai, Ruizheng, Wu, Mingyang, Paul, Sayak, Tu, Zhengzhong |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2605.24762 |
Similar Items
VISTA: Generative Visual Imagination for Vision-and-Language Navigation
by: Huang, Yanjia, et al.
Published: (2025)
HeadsUp! High-Fidelity Portrait Image Super-Resolution
by: Li, Renjie, et al.
Published: (2025)
4KAgent: Agentic Any Image to 4K Super-Resolution
by: Zuo, Yushen, et al.
Published: (2025)
Edit Away and My Face Will not Stay: Personal Biometric Defense against Malicious Generative Editing
by: Wang, Hanhui, et al.
Published: (2024)
MMHU: A Massive-Scale Multimodal Benchmark for Human Behavior Understanding
by: Li, Renjie, et al.
Published: (2025)
PANDORA: Diffusion Policy Learning for Dexterous Robotic Piano Playing
by: Huang, Yanjia, et al.
Published: (2025)
4K4DGen: Panoramic 4D Generation at 4K Resolution
by: Li, Renjie, et al.
Published: (2024)
GuideSR: Rethinking Guidance for One-Step High-Fidelity Diffusion-Based Super-Resolution
by: Arora, Aditya, et al.
Published: (2025)
VISTAv2: World Imagination for Indoor Vision-and-Language Navigation
by: Huang, Yanjia, et al.
Published: (2025)
ConsID-Gen: View-Consistent and Identity-Preserving Image-to-Video Generation
by: Wu, Mingyang, et al.
Published: (2026)
MWFormer: Multi-Weather Image Restoration Using Degradation-Aware Transformers
by: Zhu, Ruoxi, et al.
Published: (2024)
Can Large Pretrained Depth Estimation Models Help With Image Dehazing?
by: Zhang, Hongfei, et al.
Published: (2025)
FlowSteer: Conditioning Flow Field for Consistent Image Restoration
by: Wickremasinghe, Tharindu, et al.
Published: (2025)
SPIRE: Semantic Prompt-Driven Image Restoration
by: Qi, Chenyang, et al.
Published: (2023)
FORGE-Tree: Diffusion-Forcing Tree Search for Long-Horizon Robot Manipulation
by: Huang, Yanjia, et al.
Published: (2025)
Re-Align: Aligning Vision Language Models via Retrieval-Augmented Direct Preference Optimization
by: Xing, Shuo, et al.
Published: (2025)
CAST: Contrastive Adaptation and Distillation for Semi-Supervised Instance Segmentation
by: Taghavi, Pardis, et al.
Published: (2025)
Training a Student Expert via Semi-Supervised Foundation Model Distillation
by: Taghavi, Pardis, et al.
Published: (2026)
SafeCoop: Unravelling Full Stack Safety in Agentic Collaborative Driving
by: Gao, Xiangbo, et al.
Published: (2025)
PISCO: Precise Video Instance Insertion with Sparse Control
by: Gao, Xiangbo, et al.
Published: (2026)
3D4D: An Interactive, Editable, 4D World Model via 3D Video Generation
by: He, Yunhong, et al.
Published: (2025)
QuantumChem-200K: A Large-Scale Open Organic Molecular Dataset for Quantum-Chemistry Property Screening and Language Model Benchmarking
by: Zeng, Yinqi, et al.
Published: (2025)
How Independent are Large Language Models? A Statistical Framework for Auditing Behavioral Entanglement and Reweighting Verifier Ensembles
by: Kuai, Chenchen, et al.
Published: (2026)
The Pulse of Motion: Measuring Physical Frame Rate from Visual Dynamics
by: Gao, Xiangbo, et al.
Published: (2026)
Q-Router: Agentic Video Quality Assessment with Expert Model Routing and Artifact Localization
by: Xing, Shuo, et al.
Published: (2025)
CyPortQA: Benchmarking Multimodal Large Language Models for Cyclone Preparedness in Port Operation
by: Kuai, Chenchen, et al.
Published: (2025)
SCALES: Boost Binary Neural Network for Image Super-Resolution with Efficient Scalings
by: Wei, Renjie, et al.
Published: (2023)
AI-Face: A Million-Scale Demographically Annotated AI-Generated Face Dataset and Fairness Benchmark
by: Lin, Li, et al.
Published: (2024)
Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models
by: Zhu, Tinghui, et al.
Published: (2024)
UAVD4L: A Large-Scale Dataset for UAV 6-DoF Localization
by: Wu, Rouwan, et al.
Published: (2024)
Elucidating and Endowing the Diffusion Training Paradigm for General Image Restoration
by: Lu, Xin, et al.
Published: (2025)
Distribution-aware Dataset Distillation for Efficient Image Restoration
by: Zheng, Zhuoran, et al.
Published: (2025)
Region-R1: Reinforcing Query-Side Region Cropping for Multi-Modal Re-Ranking
by: Hu, Chan-Wei, et al.
Published: (2026)
CS-PaperSum: A Large-Scale Dataset of AI-Generated Summaries for Scientific Papers
by: Liu, Javin, et al.
Published: (2025)
GGT-100K: Generative Ground Truth for Generalizable Real-World Image Restoration
by: Kong, Xiangtao, et al.
Published: (2026)
Learning Spectral Diffusion Prior for Hyperspectral Image Reconstruction
by: Yu, Mingyang, et al.
Published: (2025)
Taming Generative Diffusion Prior for Universal Blind Image Restoration
by: Tu, Siwei, et al.
Published: (2024)
Background Fades, Foreground Leads: Curriculum-Guided Background Pruning for Efficient Foreground-Centric Collaborative Perception
by: Wu, Yuheng, et al.
Published: (2025)
The Role of Open-Source LLMs in Shaping the Future of GeoAI
by: Huang, Xiao, et al.
Published: (2025)
IR-Flow: Bridging Discriminative and Generative Image Restoration via Rectified Flow
by: Fan, Zihao, et al.
Published: (2026)