:: Library Catalog

Bibliographic Details
Main Authors:	Park, Jihun; Gim, Jongmin; Lee, Kyoungmin; Lee, Seunghun; Im, Sunghoon
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2408.08461

Similar Items

A Training-Free Style-Personalization via SVD-Based Feature Decomposition
by: Lee, Kyoungmin, et al.
Published: (2025)

A Training-Free Style-aligned Image Generation with Scale-wise Autoregressive Model
by: Park, Jihun, et al.
Published: (2025)

Infinite-Story: A Training-Free Consistent Text-to-Image Generation
by: Park, Jihun, et al.
Published: (2025)

Bridging Geometric and Semantic Foundation Models for Generalized Monocular Depth Estimation
by: Ma, Sanggyun, et al.
Published: (2025)

CAVIS: Context-Aware Video Instance Segmentation
by: Lee, Seunghun, et al.
Published: (2024)

CVA: Context-aware Video-text Alignment for Video Temporal Grounding
by: Moon, Sungho, et al.
Published: (2026)

Latest Object Memory Management for Temporally Consistent Video Instance Segmentation
by: Lee, Seunghun, et al.
Published: (2025)

Multi-task Learning for Real-time Autonomous Driving Leveraging Task-adaptive Attention Generator
by: Choi, Wonhyeok, et al.
Published: (2024)

Implicit Neural Image Stitching
by: Kim, Minsu, et al.
Published: (2023)

Temporal Grounding as a Learning Signal for Referring Video Object Segmentation
by: Lee, Seunghun, et al.
Published: (2025)

Depth-discriminative Metric Learning for Monocular 3D Object Detection
by: Choi, Wonhyeok, et al.
Published: (2024)

Fashion Style Editing with Generative Human Prior
by: Kong, Chaerin, et al.
Published: (2024)

Interactive Text-to-Image Retrieval with Large Language Models: A Plug-and-Play Approach
by: Lee, Saehyung, et al.
Published: (2024)

Towards Lossless Implicit Neural Representation via Bit Plane Decomposition
by: Han, Woo Kyoung, et al.
Published: (2025)

JPEG Processing Neural Operator for Backward-Compatible Coding
by: Han, Woo Kyoung, et al.
Published: (2025)

Segment Any Events with Language
by: Lee, Seungjun, et al.
Published: (2026)

Rethinking LiDAR Domain Generalization: Single Source as Multiple Density Domains
by: Kim, Jaeyeul, et al.
Published: (2023)

COCOTree: A Dataset and Benchmark for Open Tree-Structured Visual Decomposition
by: Lee, Junhyub, et al.
Published: (2026)

COMPASS: High-Efficiency Deep Image Compression with Arbitrary-scale Spatial Scalability
by: Park, Jongmin, et al.
Published: (2023)

DOGS: Distributed-Oriented Gaussian Splatting for Large-Scale 3D Reconstruction Via Gaussian Consensus
by: Chen, Yu, et al.
Published: (2024)

econSG: Efficient and Multi-view Consistent Open-Vocabulary 3D Semantic Gaussians
by: Zhang, Can, et al.
Published: (2025)

Motion4D: Learning 3D-Consistent Motion and Semantics for 4D Scene Understanding
by: Zhou, Haoran, et al.
Published: (2025)

LLaFEA: Frame-Event Complementary Fusion for Fine-Grained Spatiotemporal Understanding in LMMs
by: Zhou, Hanyu, et al.
Published: (2025)

LLaVA-4D: Embedding SpatioTemporal Prompt into LMMs for 4D Scene Understanding
by: Zhou, Hanyu, et al.
Published: (2025)

IAAO: Interactive Affordance Learning for Articulated Objects in 3D Environments
by: Zhang, Can, et al.
Published: (2025)

MotionScale: Reconstructing Appearance, Geometry, and Motion of Dynamic Scenes with Scalable 4D Gaussian Splatting
by: Zhou, Haoran, et al.
Published: (2026)

Unified Geometry and Color Compression Framework for Point Clouds via Generative Diffusion Priors
by: Huang, Tianxin, et al.
Published: (2025)

Flow4DGS-SLAM: Optical Flow-Guided 4D Gaussian Splatting SLAM
by: Wang, Yunsong, et al.
Published: (2026)

HandMCM: Multi-modal Point Cloud-based Correspondence State Space Model for 3D Hand Pose Estimation
by: Cheng, Wencan, et al.
Published: (2026)

Uni4D-LLM: A Unified SpatioTemporal-Aware VLM for 4D Understanding and Generation
by: Zhou, Hanyu, et al.
Published: (2025)

SCAPO: Self-Supervised Category-Level Articulated Pose Estimation from a Single 3D Observation
by: Zhang, Can, et al.
Published: (2026)

BurstM: Deep Burst Multi-scale SR using Fourier Space with Optical Flow
by: Kang, EungGu, et al.
Published: (2024)

Sequential Flow Straightening for Generative Modeling
by: Yoon, Jongmin, et al.
Published: (2024)

Flow4D: Leveraging 4D Voxel Network for LiDAR Scene Flow Estimation
by: Kim, Jaeyeul, et al.
Published: (2024)

Attention Frequency Modulation: Training-Free Spectral Modulation of Diffusion Cross-Attention
by: Oh, Seunghun, et al.
Published: (2026)

MV-RoMa: From Pairwise Matching into Multi-View Track Reconstruction
by: Lee, Jongmin, et al.
Published: (2026)

DiET-GS: Diffusion Prior and Event Stream-Assisted Motion Deblurring 3D Gaussian Splatting
by: Lee, Seungjun, et al.
Published: (2025)

DiSR-NeRF: Diffusion-Guided View-Consistent Super-Resolution NeRF
by: Lee, Jie Long, et al.
Published: (2024)

Dynam3D: Dynamic Layered 3D Tokens Empower VLM for Vision-and-Language Navigation
by: Wang, Zihan, et al.
Published: (2025)

Neural USD: An object-centric framework for iterative editing and control
by: Escontrela, Alejandro, et al.
Published: (2025)