:: Library Catalog

Bibliographic Details
Main Authors:	Kim, Chaeyun; Yi, Seunghoon; Kim, Yejin; Jo, Yohan; Lee, Joonseok
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2603.17413

Similar Items

Finding NeMo: Negative-mined Mosaic Augmentation for Referring Image Segmentation
by: Ha, Seongsu, et al.
Published: (2024)

Geometry-Aware Image Flow Matching
by: Lee, Junho, et al.
Published: (2026)

Latent Expression Generation for Referring Image Segmentation and Grounding
by: Yu, Seonghoon, et al.
Published: (2025)

Is There a Better Source Distribution than Gaussian? Exploring Source Distributions for Image Flow Matching
by: Lee, Junho, et al.
Published: (2025)

Towards Scalable Human-aligned Benchmark for Text-guided Image Editing
by: Ryu, Suho, et al.
Published: (2025)

Informative Object-centric Next Best View for Object-aware 3D Gaussian Splatting in Cluttered Scenes
by: Jeong, Seunghoon, et al.
Published: (2026)

InterRVOS: Interaction-aware Referring Video Object Segmentation
by: Jin, Woojeong, et al.
Published: (2025)

Modality-Aware Representation Learning for Zero-shot Sketch-based Image Retrieval
by: Lyou, Eunyi, et al.
Published: (2024)

A More Word-like Image Tokenization for MLLMs
by: Lee, Hyun, et al.
Published: (2026)

Find First, Track Next: Decoupling Identification and Propagation in Referring Video Object Segmentation
by: Cho, Suhwan, et al.
Published: (2025)

TripleSumm: Adaptive Triple-Modality Fusion for Video Summarization
by: Kim, Sumin, et al.
Published: (2026)

OVS Meets Continual Learning: Towards Sustainable Open-Vocabulary Segmentation
by: Hwang, Dongjun, et al.
Published: (2024)

Bridging the Missing-Modality Gap: Improving Text-Only Calibration of Vision Language Models
by: Kim, Mingyeong, et al.
Published: (2026)

Improving Unsupervised Video Object Segmentation via Fake Flow Generation
by: Cho, Suhwan, et al.
Published: (2024)

Dual Prototype Attention for Unsupervised Video Object Segmentation
by: Cho, Suhwan, et al.
Published: (2022)

Extending CLIP's Image-Text Alignment to Referring Image Segmentation
by: Kim, Seoyeon, et al.
Published: (2023)

Can MLLMs Reason About Visual Persuasion? Evaluating the Efficacy and Faithfulness of Reasoning
by: Lee, Naeun, et al.
Published: (2026)

Isometric Representation Learning for Disentangled Latent Space of Diffusion Models
by: Hahm, Jaehoon, et al.
Published: (2024)

MaskRIS: Semantic Distortion-aware Data Augmentation for Referring Image Segmentation
by: Lee, Minhyun, et al.
Published: (2024)

ATTIQA: Generalizable Image Quality Feature Extractor using Attribute-aware Pretraining
by: Kwon, Daekyu, et al.
Published: (2024)

MetaWeather: Few-Shot Weather-Degraded Image Restoration
by: Kim, Youngrae, et al.
Published: (2023)

Zero-Shot Industrial Anomaly Segmentation with Image-Aware Prompt Generation
by: Park, SoYoung, et al.
Published: (2025)

THE-Pose: Topological Prior with Hybrid Graph Fusion for Estimating Category-Level 6D Object Pose
by: Lee, Eunho, et al.
Published: (2025)

Equivariant Latent Alignment via Flow Matching under Group Symmetries
by: Kim, Sunghyun, et al.
Published: (2026)

Easy to Learn, Yet Hard to Forget: Towards Robust Unlearning Under Bias
by: Kwon, JuneHyoung, et al.
Published: (2026)

Tsanet: Temporal and Scale Alignment for Unsupervised Video Object Segmentation
by: Lee, Seunghoon, et al.
Published: (2023)

EVIDENT: Routing MLLM Adaptation through Entity-Grounded Visual Evidence for Cross-Domain Video Temporal Grounding
by: Ahn, Geo, et al.
Published: (2026)

Bridging the gap to real-world language-grounded visual concept learning
by: Jung, Whie, et al.
Published: (2025)

TALENT: Target-aware Efficient Tuning for Referring Image Segmentation
by: Jin, Shuo, et al.
Published: (2026)

GaussianVideo: Efficient Video Representation and Compression by Gaussian Splatting
by: Lee, Inseo, et al.
Published: (2025)

Sparse-DeRF: Deblurred Neural Radiance Fields from Sparse View
by: Lee, Dogyoon, et al.
Published: (2024)

Universal Few-Shot Spatial Control for Diffusion Models
by: Nguyen, Kiet T., et al.
Published: (2025)

DIAMOND: An LLM-Driven Agent for Context-Aware Baseball Highlight Summarization
by: Kang, Jeonghun, et al.
Published: (2025)

Feature Augmentation based Test-Time Adaptation
by: Cho, Younggeol, et al.
Published: (2024)

A Simple Baseline with Single-encoder for Referring Image Segmentation
by: Yu, Seonghoon, et al.
Published: (2024)

CoMoGaussian: Continuous Motion-Aware Gaussian Splatting from Motion-Blurred Images
by: Lee, Jungho, et al.
Published: (2025)

Multi-Context Temporal Consistent Modeling for Referring Video Object Segmentation
by: Choi, Sun-Hyuk, et al.
Published: (2025)

HyperFlow: Gradient-Free Emulation of Few-Shot Fine-Tuning
by: Kim, Donggyun, et al.
Published: (2025)

Chameleon: A Data-Efficient Generalist for Dense Visual Prediction in the Wild
by: Kim, Donggyun, et al.
Published: (2024)

CMTM: Cross-Modal Token Modulation for Unsupervised Video Object Segmentation
by: Jeon, Inseok, et al.
Published: (2026)