| Main Authors: | Cheng, Tao, Chen, Shi-Zhe, Zhang, Hao, Qin, Yixin, Luo, Jinwen, Wei, Zheng |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2604.20328 |
Similar Items
SwapTalk: Audio-Driven Talking Face Generation with One-Shot Customization in Latent Space
by: Zhang, Zeren, et al.
Published: (2024)
DHGS: Decoupled Hybrid Gaussian Splatting for Driving Scene
by: Shi, Xi, et al.
Published: (2024)
Decoupling Complexity from Scale in Latent Diffusion Model
by: Zhong, Tianxiong, et al.
Published: (2025)
Self-Consistent Latent Reasoning: Long Latent Sequence Reasoning for Vision-Language Model
by: Wang, Chenfeng, et al.
Published: (2026)
MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Policy Optimization
by: Zhao, Xiangyu, et al.
Published: (2025)
Hierarchical Dual-Subspace Decoupling for Continual Learning in Vision-Language Models
by: Qin, Mengxin, et al.
Published: (2026)
Dual Stream Independence Decoupling for True Emotion Recognition under Masked Expressions
by: Wei, Jinsheng, et al.
Published: (2026)
MIRG-RL: Multi-Image Reasoning and Grounding with Reinforcement Learning
by: Zheng, Lihao, et al.
Published: (2025)
Heuristic-inspired Reasoning Priors Facilitate Data-Efficient Referring Object Detection
by: Zhang, Xu, et al.
Published: (2026)
Latent Visual Reasoning
by: Li, Bangzheng, et al.
Published: (2025)
LDP: Generalizing to Multilingual Visual Information Extraction by Language Decoupled Pretraining
by: Shen, Huawen, et al.
Published: (2024)
MorphSeek: Fine-grained Latent Representation-Level Policy Optimization for Deformable Image Registration
by: Zhang, Runxun, et al.
Published: (2025)
Reasoning-Aligned Perception Decoupling for Scalable Multi-modal Reasoning
by: Gou, Yunhao, et al.
Published: (2025)
ZipIR: Latent Pyramid Diffusion Transformer for High-Resolution Image Restoration
by: Yu, Yongsheng, et al.
Published: (2025)
VTok: A Unified Video Tokenizer with Decoupled Spatial-Temporal Latents
by: Wang, Feng, et al.
Published: (2026)
SIDeR: Semantic Identity Decoupling for Unrestricted Face Privacy
by: Bao, Zhuosen, et al.
Published: (2026)
SmartPhotoCrafter: Unified Reasoning, Generation and Optimization for Automatic Photographic Image Editing
by: Zeng, Ying, et al.
Published: (2026)
DOLLAR: Few-Step Video Generation via Distillation and Latent Reward Optimization
by: Ding, Zihan, et al.
Published: (2024)
Long-Horizon Streaming Video Generation via Hybrid Attention with Decoupled Distillation
by: Li, Ruibin, et al.
Published: (2026)
Representation Space Constrained Learning with Modality Decoupling for Multimodal Object Detection
by: Shao, YiKang, et al.
Published: (2025)
Scale Decoupled Distillation
by: Wei, Shicai, et al.
Published: (2024)
PixPerfect: Seamless Latent Diffusion Local Editing with Discriminative Pixel-Space Refinement
by: Zheng, Haitian, et al.
Published: (2025)
BridgeShape: Latent Diffusion Schrödinger Bridge for 3D Shape Completion
by: Kong, Dequan, et al.
Published: (2025)
DeX-Portrait: Disentangled and Expressive Portrait Animation via Explicit and Latent Motion Representations
by: Shi, Yuxiang, et al.
Published: (2025)
Geometric Decoupling: Diagnosing the Structural Instability of Latent
by: Liang, Yuanbang, et al.
Published: (2026)
Gradient-Guided Modality Decoupling for Missing-Modality Robustness
by: Wang, Hao, et al.
Published: (2024)
Unlocking the Essence of Beauty: Advanced Aesthetic Reasoning with Relative-Absolute Policy Optimization
by: Liu, Boyang, et al.
Published: (2025)
Unpaired Deblurring via Decoupled Diffusion Model
by: Cheng, Junhao, et al.
Published: (2025)
APPO: Attention-guided Perception Policy Optimization for Video Reasoning
by: Du, Henghui, et al.
Published: (2026)
A Semantic Decoupling-Based Two-Stage Rainy-Day Attack for Revealing Weather Robustness Deficiencies in Vision-Language Models
by: Hu, Chengyin, et al.
Published: (2026)
VerIPO: Cultivating Long Reasoning in Video-LLMs via Verifier-Gudied Iterative Policy Optimization
by: Li, Yunxin, et al.
Published: (2025)
Learning an Efficient Optimizer via Hybrid-Policy Sub-Trajectory Balance
by: Guan, Yunchuan, et al.
Published: (2025)
Multi-Level Decoupled Relational Distillation for Heterogeneous Architectures
by: Yang, Yaoxin, et al.
Published: (2025)
Decoupled Distillation to Erase: A General Unlearning Method for Any Class-centric Tasks
by: Zhou, Yu, et al.
Published: (2025)
VDNeRF: Vision-only Dynamic Neural Radiance Field for Urban Scenes
by: Zou, Zhengyu, et al.
Published: (2025)
Walk the Talk: Bridging the Reasoning-Action Gap for Thinking with Images via Multimodal Agentic Policy Optimization
by: Yang, Wenhao, et al.
Published: (2026)
Reading or Reasoning? Format Decoupled Reinforcement Learning for Document OCR
by: Zhong, Yufeng, et al.
Published: (2025)
From Language to Locomotion: Retargeting-free Humanoid Control via Motion Latent Guidance
by: Li, Zhe, et al.
Published: (2025)
CAR: Contrast-Agnostic Deformable Medical Image Registration with Contrast-Invariant Latent Regularization
by: Wang, Yinsong, et al.
Published: (2024)
Discovering Pathology Rationale and Token Allocation for Efficient Multimodal Pathology Reasoning
by: Xu, Zhe, et al.
Published: (2025)