Library Catalog

Bibliographic Details
Main Authors:	Sun, Wenzhang; Wang, Zhenyu; Hu, Zhangchi; Wang, Chunfeng; Li, Hao; Chen, Wei
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2602.03028

Similar Items

Preserve, Reveal, Expand: Faithful 4D Video Editing with Region-Aware Conditioning
by: Hu, Zhangchi, et al.
Published: (2026)

DeCo-VAE: Learning Compact Latents for Video Reconstruction via Decoupled Representation
by: Yin, Xiangchen, et al.
Published: (2025)

PAGS: Priority-Adaptive Gaussian Splatting for Dynamic Driving Scenes
by: A, Ying, et al.
Published: (2025)

DrivingScene: A Multi-Task Online Feed-Forward 3D Gaussian Splatting Method for Dynamic Driving Scenes
by: Hou, Qirui, et al.
Published: (2025)

RiO-DETR: DETR for Real-time Oriented Object Detection
by: Hu, Zhangchi, et al.
Published: (2026)

Envision: Embodied Visual Planning via Goal-Imagery Video Diffusion
by: Gu, Yuming, et al.
Published: (2025)

LoopExpose: An Unsupervised Framework for Arbitrary-Length Exposure Correction
by: Li, Ao, et al.
Published: (2025)

Precision Synthesis of Multi-Tracer PET via VLM-Modulated Rectified Flow for Stratifying Mild Cognitive Impairment
by: Liu, Tuo, et al.
Published: (2026)

AgentSteerTTS: A Multi-Agent Closed-Loop Framework for Composite-Instruction Text-to-Speech
by: Kang, Bin, et al.
Published: (2026)

UP2You: Fast Reconstruction of Yourself from Unconstrained Photo Collections
by: Cai, Zeyu, et al.
Published: (2025)

MUSE: Manipulating Unified Framework for Synthesizing Emotions in Images via Test-Time Optimization
by: Xia, Yingjie, et al.
Published: (2025)

StoryImager: A Unified and Efficient Framework for Coherent Story Visualization and Completion
by: Tao, Ming, et al.
Published: (2024)

ReDiStory: Region-Disentangled Diffusion for Consistent Visual Story Generation
by: Sarkar, Ayushman, et al.
Published: (2026)

BOOKAGENT: Orchestrating Safety-Aware Visual Narratives via Multi-Agent Cognitive Calibration
by: Gao, Bo, et al.
Published: (2026)

SlimDiffSR: Toward Lightweight and Efficient Remote Sensing Image Super-Resolution via Diffusion Model Distillation
by: Wang, Ce, et al.
Published: (2026)

DASH: 4D Hash Encoding with Self-Supervised Decomposition for Real-Time Dynamic Scene Rendering
by: Chen, Jie, et al.
Published: (2025)

Closed-Loop Bidirectional Prompting for Adversarial Robustness of Vision Language Models
by: Liu, Xiao, et al.
Published: (2026)

Ctrl123: Consistent Novel View Synthesis via Closed-Loop Transcription
by: Zhao, Hongxiang, et al.
Published: (2024)

Dome-DETR: DETR with Density-Oriented Feature-Query Manipulation for Efficient Tiny Object Detection
by: Hu, Zhangchi, et al.
Published: (2025)

Generalizable Sparse-View 3D Reconstruction from Unconstrained Images
by: Gupta, Vinayak, et al.
Published: (2026)

PromptIQA: Boosting the Performance and Generalization for No-Reference Image Quality Assessment via Prompts
by: Chen, Zewen, et al.
Published: (2024)

Improving Composed Image Retrieval via Contrastive Learning with Scaling Positives and Negatives
by: Feng, Zhangchi, et al.
Published: (2024)

Vision-based Vehicle Re-identification in Bridge Scenario using Flock Similarity
by: Zhang, Chunfeng, et al.
Published: (2024)

MUSE: Multi-Subject Unified Synthesis via Explicit Layout Semantic Expansion
by: Peng, Fei, et al.
Published: (2025)

Envision3D: One Image to 3D with Anchor Views Interpolation
by: Pang, Yatian, et al.
Published: (2024)

UniCP: A Unified Caching and Pruning Framework for Efficient Video Generation
by: Sun, Wenzhang, et al.
Published: (2025)

Hi-VAE: Efficient Video Autoencoding with Global and Detailed Motion
by: Liu, Huaize, et al.
Published: (2025)

UniAvatar: Taming Lifelike Audio-Driven Talking Head Generation with Comprehensive Motion and Lighting Control
by: Sun, Wenzhang, et al.
Published: (2024)

MUSE: Resolving Manifold Misalignment in Visual Tokenization via Topological Orthogonality
by: Yang, Panqi, et al.
Published: (2026)

HiH: A Multi-modal Hierarchy in Hierarchy Network for Unconstrained Gait Recognition
by: Wang, Lei, et al.
Published: (2023)

Unconstrained Multi-view Human Pose Estimation with Algebraic Priors
by: Qin, Xiaolin, et al.
Published: (2026)

DisEnvisioner: Disentangled and Enriched Visual Prompt for Customized Image Generation
by: He, Jing, et al.
Published: (2024)

DK-SLAM: Monocular Visual SLAM with Deep Keypoint Learning, Tracking and Loop-Closing
by: Qu, Hao, et al.
Published: (2024)

StoryState: Agent-Based State Control for Consistent and Editable Storybooks
by: Sarkar, Ayushman, et al.
Published: (2026)

HAVE-FUN: Human Avatar Reconstruction from Few-Shot Unconstrained Images
by: Yang, Xihe, et al.
Published: (2023)

UGD-IML: A Unified Generative Diffusion-based Framework for Constrained and Unconstrained Image Manipulation Localization
by: Mi, Yachun, et al.
Published: (2025)

ChronoTailor: Harnessing Attention Guidance for Fine-Grained Video Virtual Try-On
by: Wang, Jinjuan, et al.
Published: (2025)

Closed-Loop Transfer for Weakly-supervised Affordance Grounding
by: Tang, Jiajin, et al.
Published: (2025)

AutoLayout: Closed-Loop Layout Synthesis via Slow-Fast Collaborative Reasoning
by: Chen, Weixing, et al.
Published: (2025)

FoldNet: Learning Generalizable Closed-Loop Policy for Garment Folding via Keypoint-Driven Asset and Demonstration Synthesis
by: Chen, Yuxing, et al.
Published: (2025)