| Main authors: | Kang, Haeyong; Yoon, Jaehong; Kim, DaHyun; Hwang, Sung Ju; Yoo, Chang D |
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Online access: | https://arxiv.org/abs/2306.11305 |
Similar items
Continual Learning: Forget-free Winning Subnetworks for Video Representations
by: Kang, Haeyong, et al.
Published: (2023)
EVEREST: Efficient Masked Video Autoencoder by Removing Redundant Spatiotemporal Tokens
by: Hwang, Sunil, et al.
Published: (2022)
Soft-TransFormers for Continual Learning
by: Kang, Haeyong, et al.
Published: (2024)
STELLA: Continual Audio-Video Pre-training with Spatio-Temporal Localized Alignment
by: Lee, Jaewoo, et al.
Published: (2023)
BECoTTA: Input-dependent Online Blending of Experts for Continual Test-time Adaptation
by: Lee, Daeun, et al.
Published: (2024)
WorldMM: Dynamic Multimodal Memory Agent for Long Video Reasoning
by: Yeo, Woongyeong, et al.
Published: (2025)
Self-Refining Video Sampling
by: Jang, Sangwon, et al.
Published: (2026)
Frame Guidance: Training-Free Guidance for Frame-Level Control in Video Diffusion Models
by: Jang, Sangwon, et al.
Published: (2025)
Semantic-Aware Reconstruction Error for Detecting AI-Generated Images
by: Kang, Ju Yeon, et al.
Published: (2025)
Avatar Forcing: Real-Time Interactive Head Avatar Generation for Natural Conversation
by: Ki, Taekyung, et al.
Published: (2026)
FRAG: Frequency Adapting Group for Diffusion Video Editing
by: Yoon, Sunjae, et al.
Published: (2024)
SCANet: Scene Complexity Aware Network for Weakly-Supervised Video Moment Retrieval
by: Yoon, Sunjae, et al.
Published: (2023)
Improving Neural Radiance Field using Near-Surface Sampling with Point Cloud Generation
by: Yoo, Hye Bin, et al.
Published: (2023)
Are Video Reasoning Models Ready to Go Outside?
by: He, Yangfan, et al.
Published: (2026)
Selective Query-guided Debiasing for Video Corpus Moment Retrieval
by: Yoon, Sunjae, et al.
Published: (2022)
Rethinking Saliency-Guided Weakly-Supervised Semantic Segmentation
by: Kim, Beomyoung, et al.
Published: (2024)
Multimodal Representation Learning by Alternating Unimodal Adaptation
by: Zhang, Xiaohui, et al.
Published: (2023)
Towards Label-Efficient Human Matting: A Simple Baseline for Weakly Semi-Supervised Trimap-Free Human Matting
by: Kim, Beomyoung, et al.
Published: (2024)
ECLIPSE: Efficient Continual Learning in Panoptic Segmentation with Visual Prompt Tuning
by: Kim, Beomyoung, et al.
Published: (2024)
VideoICL: Confidence-based Iterative In-context Learning for Out-of-Distribution Video Understanding
by: Kim, Kangsan, et al.
Published: (2024)
CREMA: Generalizable and Efficient Video-Language Reasoning via Multimodal Modular Fusion
by: Yu, Shoubin, et al.
Published: (2024)
RACCooN: A Versatile Instructional Video Editing Framework with Auto-Generated Narratives
by: Yoon, Jaehong, et al.
Published: (2024)
ECoFLaP: Efficient Coarse-to-Fine Layer-Wise Pruning for Vision-Language Models
by: Sung, Yi-Lin, et al.
Published: (2023)
DNI: Dilutional Noise Initialization for Diffusion Video Editing
by: Yoon, Sunjae, et al.
Published: (2024)
VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos
by: Wang, Ziyang, et al.
Published: (2024)
Can Video LLMs Refuse to Answer? Alignment for Answerability in Video Large Language Models
by: Yoon, Eunseop, et al.
Published: (2025)
Video-Skill-CoT: Skill-based Chain-of-Thoughts for Domain-Adaptive Video Reasoning
by: Lee, Daeun, et al.
Published: (2025)
ViKey: Enhancing Temporal Understanding in Videos via Visual Prompting
by: Lee, Yeonkyung, et al.
Published: (2026)
Neural Pose Representation Learning for Generating and Transferring Non-Rigid Object Poses
by: Yoo, Seungwoo, et al.
Published: (2024)
UCMNet: Uncertainty-Aware Context Memory Network for Under-Display Camera Image Restoration
by: Kim, Daehyun, et al.
Published: (2026)
Semantic Watermarking Reinvented: Enhancing Robustness and Generation Quality with Fourier Integrity
by: Lee, Sung Ju, et al.
Published: (2025)
Self-Correcting Text-to-Video Generation with Misalignment Detection and Localized Refinement
by: Lee, Daeun, et al.
Published: (2024)
VideoRAG: Retrieval-Augmented Generation over Video Corpus
by: Jeong, Soyeong, et al.
Published: (2025)
Inference-Time Scaling for Flow Models via Stochastic Generation and Rollover Budget Forcing
by: Kim, Jaihoon, et al.
Published: (2025)
Fourier Decomposition for Explicit Representation of 3D Point Cloud Attributes
by: Kim, Donghyun, et al.
Published: (2025)
Wavelet-Guided Acceleration of Text Inversion in Diffusion-Based Image Editing
by: Koo, Gwanhyeong, et al.
Published: (2024)
SNeRV: Spectra-preserving Neural Representation for Video
by: Kim, Jina, et al.
Published: (2025)
MA-EgoQA: Question Answering over Egocentric Videos from Multiple Embodied Agents
by: Kim, Kangsan, et al.
Published: (2026)
BF-STVSR: B-Splines and Fourier-Best Friends for High Fidelity Spatial-Temporal Video Super-Resolution
by: Kim, Eunjin, et al.
Published: (2025)
MultiTalk: Enhancing 3D Talking Head Generation Across Languages with Multilingual Video Dataset
by: Sung-Bin, Kim, et al.
Published: (2024)