Library Catalog

Bibliographic Details
Main Authors:	Wang, Yuan; Liao, Borui; Huang, Huijuan; Lu, Jinda; Li, Ouxiang; Liu, Kuien; Wang, Meng; Wang, Xiang
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2601.04033

Similar Items

Think, then Score: Decoupled Reasoning and Scoring for Video Reward Modeling
by: Wang, Yuan, et al.
Published: (2026)

Precise, Fast, and Low-cost Concept Erasure in Value Space: Orthogonal Complement Matters
by: Wang, Yuan, et al.
Published: (2024)

Relaxing Anchor-Frame Dominance for Mitigating Hallucinations in Video Large Language Models
by: Liu, Zijian, et al.
Published: (2026)

FrameThinker: Learning to Think with Long Videos via Multi-Turn Frame Spotlighting
by: He, Zefeng, et al.
Published: (2025)

Easier Painting Than Thinking: Can Text-to-Image Models Set the Stage, but Not Direct the Play?
by: Li, Ouxiang, et al.
Published: (2025)

Beyond Where to Look: Trajectory-Guided Reinforcement Learning for Multimodal RLVR
by: Lu, Jinda, et al.
Published: (2026)

When Thinking Hurts: Mitigating Visual Forgetting in Video Reasoning via Frame Repetition
by: Sun, Xiaokun, et al.
Published: (2026)

Frame-Level Captions for Long Video Generation with Complex Multi Scenes
by: Zheng, Guangcong, et al.
Published: (2025)

Self-supervised Learning of Event-guided Video Frame Interpolation for Rolling Shutter Frames
by: Lu, Yunfan, et al.
Published: (2023)

VideoAR: Autoregressive Video Generation via Next-Frame & Scale Prediction
by: Ji, Longbin, et al.
Published: (2026)

TempoMaster: Efficient Long Video Generation via Next-Frame-Rate Prediction
by: Ma, Yukuo, et al.
Published: (2025)

Detecting AI-Generated Video via Frame Consistency
by: Ma, Long, et al.
Published: (2024)

Autoregressive Video Generation beyond Next Frames Prediction
by: Ren, Sucheng, et al.
Published: (2025)

FrameMind: Frame-Interleaved Video Reasoning via Reinforcement Learning
by: Ge, Haonan, et al.
Published: (2025)

Video Frame Interpolation for Polarization via Swin-Transformer
by: Huang, Feng, et al.
Published: (2024)

VGDFR: Diffusion-based Video Generation with Dynamic Latent Frame Rate
by: Yuan, Zhihang, et al.
Published: (2025)

Rethinking Visual Content Refinement in Low-Shot CLIP Adaptation
by: Lu, Jinda, et al.
Published: (2024)

360VFI: A Dataset and Benchmark for Omnidirectional Video Frame Interpolation
by: Lu, Wenxuan, et al.
Published: (2024)

Perception-Oriented Video Frame Interpolation via Asymmetric Blending
by: Wu, Guangyang, et al.
Published: (2024)

Velocity Disambiguation for Video Frame Interpolation
by: Zhong, Zhihang, et al.
Published: (2023)

Frame by Familiar Frame: Understanding Replication in Video Diffusion Models
by: Rahman, Aimon, et al.
Published: (2024)

Frame-Voyager: Learning to Query Frames for Video Large Language Models
by: Yu, Sicheng, et al.
Published: (2024)

Frame In-N-Out: Unbounded Controllable Image-to-Video Generation
by: Wang, Boyang, et al.
Published: (2025)

Generative Inbetweening through Frame-wise Conditions-Driven Video Generation
by: Zhu, Tianyi, et al.
Published: (2024)

DLFR-VAE: Dynamic Latent Frame Rate VAE for Video Generation
by: Yuan, Zhihang, et al.
Published: (2025)

Motion-aware Latent Diffusion Models for Video Frame Interpolation
by: Huang, Zhilin, et al.
Published: (2024)

Beyond the Last Frame: Process-aware Evaluation for Generative Video Reasoning
by: Li, Yifan, et al.
Published: (2025)

VFIMamba: Video Frame Interpolation with State Space Models
by: Zhang, Guozhen, et al.
Published: (2024)

STORYANCHORS: Generating Consistent Multi-Scene Story Frames for Long-Form Narratives
by: Wang, Bo, et al.
Published: (2025)

InfiniteTalk: Audio-driven Video Generation for Sparse-Frame Video Dubbing
by: Yang, Shaoshu, et al.
Published: (2025)

Sparse Global Matching for Video Frame Interpolation with Large Motion
by: Liu, Chunxu, et al.
Published: (2024)

Benchmarking Video Frame Interpolation
by: Kiefhaber, Simon, et al.
Published: (2024)

Chain-of-Frames: Advancing Video Understanding in Multimodal LLMs via Frame-Aware Reasoning
by: Ghazanfari, Sara, et al.
Published: (2025)

Mamba-FETrack: Frame-Event Tracking via State Space Model
by: Huang, Ju, et al.
Published: (2024)

End-to-End Video Question Answering with Frame Scoring Mechanisms and Adaptive Sampling
by: Liang, Jianxin, et al.
Published: (2024)

FrameBridge: Improving Image-to-Video Generation with Bridge Models
by: Wang, Yuji, et al.
Published: (2024)

Frame Context Packing and Drift Prevention in Next-Frame-Prediction Video Diffusion Models
by: Zhang, Lvmin, et al.
Published: (2025)

DreamFrame: Enhancing Video Understanding via Automatically Generated QA and Style-Consistent Keyframes
by: Song, Zhende, et al.
Published: (2024)

DreaMontage: Arbitrary Frame-Guided One-Shot Video Generation
by: Liu, Jiawei, et al.
Published: (2025)

Think-Clip-Sample: Slow-Fast Frame Selection for Video Understanding
by: Tan, Wenhui, et al.
Published: (2026)