:: Library Catalog

Bibliographic Details
Main Authors:	Kim, Jihwan; Parthasarathy, Nikhil; Qin, Danfeng; Hur, Junhwa; Sun, Deqing; Han, Bohyung; Yang, Ming-Hsuan; Gong, Boqing
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2605.17260

Similar Items

High-Resolution Frame Interpolation with Patch-based Cascaded Diffusion
by: Hur, Junhwa, et al.
Published: (2024)

FIFO-Diffusion: Generating Infinite Videos from Text without Training
by: Kim, Jihwan, et al.
Published: (2024)

Telling Left from Right: Identifying Geometry-Aware Semantic Correspondence
by: Zhang, Junyi, et al.
Published: (2023)

Map the Flow: Revealing Hidden Pathways of Information in VideoLLMs
by: Kim, Minji, et al.
Published: (2025)

MonST3R: A Simple Approach for Estimating Geometry in the Presence of Motion
by: Zhang, Junyi, et al.
Published: (2024)

LoGeR: Long-Context Geometric Reconstruction with Hybrid Memory
by: Zhang, Junyi, et al.
Published: (2026)

Video Parallel Scaling: Aggregating Diverse Frame Subsets for VideoLLMs
by: Chung, Hyungjin, et al.
Published: (2025)

GeCo: Evaluating Geometric Consistency for Video Generation via Motion and Structure
by: Gu, Leslie, et al.
Published: (2025)

Restoration-Oriented Video Frame Interpolation with Region-Distinguishable Priors from SAM
by: Han, Yan, et al.
Published: (2023)

The LLM Bottleneck: Why Open-Source Vision LLMs Struggle with Hierarchical Visual Recognition
by: Tan, Yuwen, et al.
Published: (2025)

Boundary Attention: Learning curves, corners, junctions and grouping
by: Polansky, Mia Gaia, et al.
Published: (2024)

New Tight Wavelet Frame Constructions Sharing Responsibility
by: Hur, Youngmi, et al.
Published: (2024)

UFO-4D: Unposed Feedforward 4D Reconstruction from Two Images
by: Hur, Junhwa, et al.
Published: (2026)

Multimodal Alignment with Cross-Attentive GRUs for Fine-Grained Video Understanding
by: Kim, Namho, et al.
Published: (2025)

Communication-Efficient Federated Learning with Accelerated Client Gradient
by: Kim, Geeho, et al.
Published: (2022)

Leveraging Temporal Contextualization for Video Action Recognition
by: Kim, Minji, et al.
Published: (2024)

Image Diffusion Preview with Consistency Solver
by: Wang, Fu-Yun, et al.
Published: (2025)

GP-4DGS: Probabilistic 4D Gaussian Splatting from Monocular Video via Variational Gaussian Processes
by: Kim, Mijeong, et al.
Published: (2026)

Emergent Temporal Correspondences from Video Diffusion Transformers
by: Nam, Jisu, et al.
Published: (2025)

Q-Frame: Query-aware Frame Selection and Multi-Resolution Adaptation for Video-LLMs
by: Zhang, Shaojie, et al.
Published: (2025)

Chain-of-Frames: Advancing Video Understanding in Multimodal LLMs via Frame-Aware Reasoning
by: Ghazanfari, Sara, et al.
Published: (2025)

MASIV: Toward Material-Agnostic System Identification from Videos
by: Zhao, Yizhou, et al.
Published: (2025)

Re-evaluating Group Robustness via Adaptive Class-Specific Scaling
by: Seo, Seonguk, et al.
Published: (2024)

TrackCraft3R: Repurposing Video Diffusion Transformers for Dense 3D Tracking
by: Nam, Jisu, et al.
Published: (2026)

Motion-Aware Video Frame Interpolation
by: Han, Pengfei, et al.
Published: (2024)

LADDER: An Efficient Framework for Video Frame Interpolation
by: Shen, Tong, et al.
Published: (2024)

Narrating the Video: Boosting Text-Video Retrieval via Comprehensive Utilization of Frame-Level Captions
by: Hur, Chan, et al.
Published: (2025)

VideoPrism: A Foundational Visual Encoder for Video Understanding
by: Zhao, Long, et al.
Published: (2024)

Tinted Frames: Question Framing Blinds Vision-Language Models
by: Fan, Wan-Cyuan, et al.
Published: (2026)

Frame Scaling by Graphs
by: K, Ayyanar, et al.
Published: (2024)

Frame Guidance: Training-Free Guidance for Frame-Level Control in Video Diffusion Models
by: Jang, Sangwon, et al.
Published: (2025)

Frame by Familiar Frame: Understanding Replication in Video Diffusion Models
by: Rahman, Aimon, et al.
Published: (2024)

Beyond World-Frame Action Heads: Motion-Centric Action Frames for Vision-Language-Action Models
by: Yang, Huoren, et al.
Published: (2026)

Frame by Frame
by: Frank, Hannah
Published: (2019)

Frame by Frame
by: Frank, Hannah
Published: (2020)

Do LLMs Encode Frame Semantics? Evidence from Frame Identification
by: Chundru, Jayanth Krishna, et al.
Published: (2025)

Frame-Difference Guided Dynamic Region Perception for CLIP Adaptation in Text-Video Retrieval
by: Yu, Jiaao, et al.
Published: (2025)

A Simple Recipe for Contrastively Pre-training Video-First Encoders Beyond 16 Frames
by: Papalampidi, Pinelopi, et al.
Published: (2023)

Seeing Beyond Frames: Zero-Shot Pedestrian Intention Prediction with Raw Temporal Video and Multimodal Cues
by: Zambare, Pallavi, et al.
Published: (2025)

Enhancing Video Inpainting with Aligned Frame Interval Guidance
by: Xie, Ming, et al.
Published: (2025)