| Main Authors: | Zhao, Haoyu, Zhang, Zihao, Gu, Jiaxi, Chen, Haoran, Zheng, Qingping, Tang, Pin, Jin, Yeyin, Zhang, Yuang, Cheng, Junqi, Lu, Zenghui, Shu, Peng, Wu, Zuxuan, Jiang, Yu-Gang |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2604.09201 |
Similar Items
CameraNoise: Enabling Faithful Camera Control in Video Diffusion through Geometry-Flow-Guided Noise Warping
by: Zhao, Haoyu, et al.
Published: (2026)
DCDM: Divide-and-Conquer Diffusion Models for Consistency-Preserving Video Generation
by: Zhao, Haoyu, et al.
Published: (2026)
ShoulderShot: Generating Over-the-Shoulder Dialogue Videos
by: Zhang, Yuang, et al.
Published: (2025)
MagDiff: Multi-Alignment Diffusion for High-Fidelity Video Generation and Editing
by: Zhao, Haoyu, et al.
Published: (2023)
Repeating Words for Video-Language Retrieval with Coarse-to-Fine Objectives
by: Zhao, Haoyu, et al.
Published: (2025)
AutoTVG: A New Vision-language Pre-training Paradigm for Temporal Video Grounding
by: Zhang, Xing, et al.
Published: (2024)
MimicMotion: High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance
by: Zhang, Yuang, et al.
Published: (2024)
EDEN: Enhanced Diffusion for High-quality Large-motion Video Frame Interpolation
by: Zhang, Zihao, et al.
Published: (2025)
SpaceMind: Camera-Guided Modality Fusion for Spatial Reasoning in Vision-Language Models
by: Zhao, Ruosen, et al.
Published: (2025)
Predicting Camera Pose from Perspective Descriptions for Spatial Reasoning
by: Zhang, Xuejun, et al.
Published: (2026)
Unify Robot Actions in Camera Frame
by: Xie, Sicheng, et al.
Published: (2025)
Hybrid Spiking Vision Transformer for Object Detection with Event Cameras
by: Xu, Qi, et al.
Published: (2025)
Grounding Actions in Camera Space: Observation-Centric Vision-Language-Action Policy
by: Zhang, Tianyi, et al.
Published: (2025)
Deep Unrolling Networks with Recurrent Momentum Acceleration for Nonlinear Inverse Problems
by: Zhou, Qingping, et al.
Published: (2023)
MotionMaster: Training-free Camera Motion Transfer For Video Generation
by: Hu, Teng, et al.
Published: (2024)
Event USKT : U-State Space Model in Knowledge Transfer for Event Cameras
by: Lin, Yuhui, et al.
Published: (2024)
Fuse Your Latents: Video Editing with Multi-source Latent Diffusion Models
by: Lu, Tianyi, et al.
Published: (2023)
CameraCtrl: Enabling Camera Control for Text-to-Video Generation
by: He, Hao, et al.
Published: (2024)
WHAC: World-grounded Humans and Cameras
by: Yin, Wanqi, et al.
Published: (2024)
CamReasoner: Reinforcing Camera Movement Understanding via Structured Spatial Reasoning
by: Wu, Hang, et al.
Published: (2026)
Egocentric Gaze Estimation via Neck-Mounted Camera
by: Huang, Haoyu, et al.
Published: (2026)
Paleoinspired Vision: From Exploring Colour Vision Evolution to Inspiring Camera Design
by: Zhang, Junjie, et al.
Published: (2024)
VividCam: Learning Unconventional Camera Motions from Virtual Synthetic Videos
by: Wu, Qiucheng, et al.
Published: (2025)
Open Vision Reasoner: Transferring Linguistic Cognitive Behavior for Visual Reasoning
by: Wei, Yana, et al.
Published: (2025)
Enabling Cross-Camera Collaboration for Video Analytics on Distributed Smart Cameras
by: Min, Chulhong, et al.
Published: (2024)
High-speed and High-quality Vision Reconstruction of Spike Camera with Spike Stability Theorem
by: Zhang, Wei, et al.
Published: (2024)
Downstream Transfer Attack: Adversarial Attacks on Downstream Models with Pre-trained Vision Transformers
by: Zheng, Weijie, et al.
Published: (2024)
Light-X: Generative 4D Video Rendering with Camera and Illumination Control
by: Liu, Tianqi, et al.
Published: (2025)
EasyControl: Transfer ControlNet to Video Diffusion for Controllable Generation and Interpolation
by: Wang, Cong, et al.
Published: (2024)
Seeing Through Pixel Motion: Learning Obstacle Avoidance from Optical Flow with One Camera
by: Hu, Yu, et al.
Published: (2024)
CPA: Camera-pose-awareness Diffusion Transformer for Video Generation
by: Wang, Yuelei, et al.
Published: (2024)
Camera Obscura, Camera Lucida
Published: (2010)
Adaptive Camera Sensor for Vision Models
by: Baek, Eunsu, et al.
Published: (2025)
Computer Vision with a Superpixelation Camera
by: Mahalingam, Sasidharan, et al.
Published: (2026)
BulletTime: Decoupled Control of Time and Camera Pose for Video Generation
by: Wang, Yiming, et al.
Published: (2025)
DynamiCtrl: Rethinking the Basic Structure and the Role of Text for High-quality Human Image Animation
by: Zhao, Haoyu, et al.
Published: (2025)
FaceCam: Portrait Video Camera Control via Scale-Aware Conditioning
by: Lyu, Weijie, et al.
Published: (2026)
I2VControl-Camera: Precise Video Camera Control with Adjustable Motion Strength
by: Feng, Wanquan, et al.
Published: (2024)
VLA-Pro: Cross-Task Procedural Memory Transfer for Vision-Language-Action Models
by: Si, Shengyu, et al.
Published: (2026)
Adaptive Retention & Correction: Test-Time Training for Continual Learning
by: Chen, Haoran, et al.
Published: (2024)