| Main Authors: | Cakmak, Mert Can, Agarwal, Nitin, Poudel, Diwash |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2506.19168 |
Similar Items
TriPSS: A Tri-Modal Keyframe Extraction Framework Using Perceptual, Structural, and Semantic Representations
by: Cakmak, Mert Can, et al.
Published: (2025)
Investigating Algorithmic Bias in YouTube Shorts
by: Cakmak, Mert Can, et al.
Published: (2025)
A Keyframe-Based Approach for Auditing Bias in YouTube Shorts Recommendations
by: Cakmak, Mert Can, et al.
Published: (2025)
Large Model based Sequential Keyframe Extraction for Video Summarization
by: Tan, Kailong, et al.
Published: (2024)
Controllable Human-centric Keyframe Interpolation with Generative Prior
by: Guo, Zujin, et al.
Published: (2025)
KeyframeFace: Language-Driven Facial Animation via Semantic Keyframes
by: Wu, Jingchao, et al.
Published: (2025)
MomentSeg: Moment-Centric Sampling for Enhanced Video Pixel Understanding
by: Dai, Ming, et al.
Published: (2025)
Human-Centric Transformer for Domain Adaptive Action Recognition
by: Lin, Kun-Yu, et al.
Published: (2024)
KeyGS: A Keyframe-Centric Gaussian Splatting Method for Monocular Image Sequences
by: Chang, Keng-Wei, et al.
Published: (2024)
Object-Centric Framework for Video Moment Retrieval
by: Li, Zongyao, et al.
Published: (2025)
MS-CLR: Multi-Skeleton Contrastive Learning for Human Action Recognition
by: Kiray, Mert, et al.
Published: (2025)
Parameter Efficient Fine-tuning for Domain-specific Gastrointestinal Disease Recognition
by: Poudel, Sanjaya, et al.
Published: (2026)
Placing Human Animations into 3D Scenes by Learning Interaction- and Geometry-Driven Keyframes
by: Mullen Jr, James F., et al.
Published: (2022)
Material Fingerprinting: Identifying and Predicting Perceptual Attributes of Material Appearance
by: Filip, Jiri, et al.
Published: (2024)
PRISM: Streaming Human Motion Generation with Per-Joint Latent Decomposition
by: Ling, Zeyu, et al.
Published: (2026)
Why Sample Space Matters: Keyframe Sampling Optimization for LiDAR-based Place Recognition
by: Stathoulopoulos, Nikolaos, et al.
Published: (2024)
Patch as Node: Human-Centric Graph Representation Learning for Multimodal Action Recognition
by: Liang, Zeyu, et al.
Published: (2025)
KFFocus: Highlighting Keyframes for Enhanced Video Understanding
by: Nie, Ming, et al.
Published: (2025)
Agentic Keyframe Search for Video Question Answering
by: Fan, Sunqi, et al.
Published: (2025)
Aligning Moments in Time using Video Queries
by: Kumar, Yogesh, et al.
Published: (2025)
Do MLLMs Exhibit Human-like Perceptual Behaviors? HVSBench: A Benchmark for MLLM Alignment with Human Perceptual Behavior
by: Lin, Jiaying, et al.
Published: (2024)
Motion Keyframe Interpolation for Any Human Skeleton via Temporally Consistent Point Cloud Sampling and Reconstruction
by: Mo, Clinton, et al.
Published: (2024)
Synthesizing Images on Perceptual Boundaries of ANNs for Uncovering Human Perceptual Variability on Facial Expressions
by: Deng, Haotian, et al.
Published: (2025)
Coarse-to-Fine 3D Keyframe Transporter
by: Zhu, Xupeng, et al.
Published: (2025)
Keyframe-Based Feed-Forward Visual Odometry
by: Dai, Weichen, et al.
Published: (2026)
MANGO: Multimodal Attention-based Normalizing Flow Approach to Fusion Learning
by: Truong, Thanh-Dat, et al.
Published: (2025)
CineVerse: Consistent Keyframe Synthesis for Cinematic Scene Composition
by: Phung, Quynh, et al.
Published: (2025)
Threading Keyframe with Narratives: MLLMs as Strong Long Video Comprehenders
by: Fang, Bo, et al.
Published: (2025)
Less is More: Improving Motion Diffusion Models with Sparse Keyframes
by: Bae, Jinseok, et al.
Published: (2025)
Range-Agnostic Multi-View Depth Estimation With Keyframe Selection
by: Conti, Andrea, et al.
Published: (2024)
Generative Inbetweening: Adapting Image-to-Video Models for Keyframe Interpolation
by: Wang, Xiaojuan, et al.
Published: (2024)
KS-APR: Keyframe Selection for Robust Absolute Pose Regression
by: Liu, Changkun, et al.
Published: (2023)
Beyond Image Super-Resolution for Image Recognition with Task-Driven Perceptual Loss
by: Kim, Jaeha, et al.
Published: (2024)
Generative Motion Infilling From Imprecisely Timed Keyframes
by: Goel, Purvi, et al.
Published: (2025)
Text-guided 3D Human Motion Generation with Keyframe-based Parallel Skip Transformer
by: Geng, Zichen, et al.
Published: (2024)
VTAgent: Agentic Keyframe Anchoring for Evidence-Aware Video TextVQA
by: He, Haibin, et al.
Published: (2026)
Occlusion-Aware Physics-Semantic Keyframe Selection for Robust Video Editing
by: Liu, Lin, et al.
Published: (2026)
PRISM: Color-Stratified Point Cloud Sampling
by: Lim, Hansol, et al.
Published: (2026)
MultiTSF: Transformer-based Sensor Fusion for Human-Centric Multi-view and Multi-modal Action Recognition
by: Nguyen, Trung Thanh, et al.
Published: (2025)
Decomposing Queries into Tool Calls for Long-Video Keyframe Retrieval
by: Shlapentokh-Rothman, Michal, et al.
Published: (2026)