:: Library Catalog

Bibliographic Details
Main Authors:	Cakmak, Mert Can; Agarwal, Nitin; Poudel, Diwash
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2506.19168

Similar Items

TriPSS: A Tri-Modal Keyframe Extraction Framework Using Perceptual, Structural, and Semantic Representations
by: Cakmak, Mert Can, et al.
Published: (2025)

Investigating Algorithmic Bias in YouTube Shorts
by: Cakmak, Mert Can, et al.
Published: (2025)

A Keyframe-Based Approach for Auditing Bias in YouTube Shorts Recommendations
by: Cakmak, Mert Can, et al.
Published: (2025)

Large Model based Sequential Keyframe Extraction for Video Summarization
by: Tan, Kailong, et al.
Published: (2024)

Controllable Human-centric Keyframe Interpolation with Generative Prior
by: Guo, Zujin, et al.
Published: (2025)

KeyframeFace: Language-Driven Facial Animation via Semantic Keyframes
by: Wu, Jingchao, et al.
Published: (2025)

MomentSeg: Moment-Centric Sampling for Enhanced Video Pixel Understanding
by: Dai, Ming, et al.
Published: (2025)

Human-Centric Transformer for Domain Adaptive Action Recognition
by: Lin, Kun-Yu, et al.
Published: (2024)

KeyGS: A Keyframe-Centric Gaussian Splatting Method for Monocular Image Sequences
by: Chang, Keng-Wei, et al.
Published: (2024)

Object-Centric Framework for Video Moment Retrieval
by: Li, Zongyao, et al.
Published: (2025)

MS-CLR: Multi-Skeleton Contrastive Learning for Human Action Recognition
by: Kiray, Mert, et al.
Published: (2025)

Parameter Efficient Fine-tuning for Domain-specific Gastrointestinal Disease Recognition
by: Poudel, Sanjaya, et al.
Published: (2026)

Placing Human Animations into 3D Scenes by Learning Interaction- and Geometry-Driven Keyframes
by: Mullen Jr, James F., et al.
Published: (2022)

Material Fingerprinting: Identifying and Predicting Perceptual Attributes of Material Appearance
by: Filip, Jiri, et al.
Published: (2024)

PRISM: Streaming Human Motion Generation with Per-Joint Latent Decomposition
by: Ling, Zeyu, et al.
Published: (2026)

Why Sample Space Matters: Keyframe Sampling Optimization for LiDAR-based Place Recognition
by: Stathoulopoulos, Nikolaos, et al.
Published: (2024)

Patch as Node: Human-Centric Graph Representation Learning for Multimodal Action Recognition
by: Liang, Zeyu, et al.
Published: (2025)

KFFocus: Highlighting Keyframes for Enhanced Video Understanding
by: Nie, Ming, et al.
Published: (2025)

Agentic Keyframe Search for Video Question Answering
by: Fan, Sunqi, et al.
Published: (2025)

Aligning Moments in Time using Video Queries
by: Kumar, Yogesh, et al.
Published: (2025)

Do MLLMs Exhibit Human-like Perceptual Behaviors? HVSBench: A Benchmark for MLLM Alignment with Human Perceptual Behavior
by: Lin, Jiaying, et al.
Published: (2024)

Motion Keyframe Interpolation for Any Human Skeleton via Temporally Consistent Point Cloud Sampling and Reconstruction
by: Mo, Clinton, et al.
Published: (2024)

Synthesizing Images on Perceptual Boundaries of ANNs for Uncovering Human Perceptual Variability on Facial Expressions
by: Deng, Haotian, et al.
Published: (2025)

Coarse-to-Fine 3D Keyframe Transporter
by: Zhu, Xupeng, et al.
Published: (2025)

Keyframe-Based Feed-Forward Visual Odometry
by: Dai, Weichen, et al.
Published: (2026)

MANGO: Multimodal Attention-based Normalizing Flow Approach to Fusion Learning
by: Truong, Thanh-Dat, et al.
Published: (2025)

CineVerse: Consistent Keyframe Synthesis for Cinematic Scene Composition
by: Phung, Quynh, et al.
Published: (2025)

Threading Keyframe with Narratives: MLLMs as Strong Long Video Comprehenders
by: Fang, Bo, et al.
Published: (2025)

Less is More: Improving Motion Diffusion Models with Sparse Keyframes
by: Bae, Jinseok, et al.
Published: (2025)

Range-Agnostic Multi-View Depth Estimation With Keyframe Selection
by: Conti, Andrea, et al.
Published: (2024)

Generative Inbetweening: Adapting Image-to-Video Models for Keyframe Interpolation
by: Wang, Xiaojuan, et al.
Published: (2024)

KS-APR: Keyframe Selection for Robust Absolute Pose Regression
by: Liu, Changkun, et al.
Published: (2023)

Beyond Image Super-Resolution for Image Recognition with Task-Driven Perceptual Loss
by: Kim, Jaeha, et al.
Published: (2024)

Generative Motion Infilling From Imprecisely Timed Keyframes
by: Goel, Purvi, et al.
Published: (2025)

Text-guided 3D Human Motion Generation with Keyframe-based Parallel Skip Transformer
by: Geng, Zichen, et al.
Published: (2024)

VTAgent: Agentic Keyframe Anchoring for Evidence-Aware Video TextVQA
by: He, Haibin, et al.
Published: (2026)

Occlusion-Aware Physics-Semantic Keyframe Selection for Robust Video Editing
by: Liu, Lin, et al.
Published: (2026)

PRISM: Color-Stratified Point Cloud Sampling
by: Lim, Hansol, et al.
Published: (2026)

MultiTSF: Transformer-based Sensor Fusion for Human-Centric Multi-view and Multi-modal Action Recognition
by: Nguyen, Trung Thanh, et al.
Published: (2025)

Decomposing Queries into Tool Calls for Long-Video Keyframe Retrieval
by: Shlapentokh-Rothman, Michal, et al.
Published: (2026)