| Main Authors: | Kim, Chaeyun, Yi, Seunghoon, Kim, Yejin, Jo, Yohan, Lee, Joonseok |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2603.17413 |
Similar Items
Finding NeMo: Negative-mined Mosaic Augmentation for Referring Image Segmentation
by: Ha, Seongsu, et al.
Published: (2024)
Geometry-Aware Image Flow Matching
by: Lee, Junho, et al.
Published: (2026)
Latent Expression Generation for Referring Image Segmentation and Grounding
by: Yu, Seonghoon, et al.
Published: (2025)
Is There a Better Source Distribution than Gaussian? Exploring Source Distributions for Image Flow Matching
by: Lee, Junho, et al.
Published: (2025)
Towards Scalable Human-aligned Benchmark for Text-guided Image Editing
by: Ryu, Suho, et al.
Published: (2025)
Informative Object-centric Next Best View for Object-aware 3D Gaussian Splatting in Cluttered Scenes
by: Jeong, Seunghoon, et al.
Published: (2026)
InterRVOS: Interaction-aware Referring Video Object Segmentation
by: Jin, Woojeong, et al.
Published: (2025)
Modality-Aware Representation Learning for Zero-shot Sketch-based Image Retrieval
by: Lyou, Eunyi, et al.
Published: (2024)
A More Word-like Image Tokenization for MLLMs
by: Lee, Hyun, et al.
Published: (2026)
Find First, Track Next: Decoupling Identification and Propagation in Referring Video Object Segmentation
by: Cho, Suhwan, et al.
Published: (2025)
TripleSumm: Adaptive Triple-Modality Fusion for Video Summarization
by: Kim, Sumin, et al.
Published: (2026)
OVS Meets Continual Learning: Towards Sustainable Open-Vocabulary Segmentation
by: Hwang, Dongjun, et al.
Published: (2024)
Bridging the Missing-Modality Gap: Improving Text-Only Calibration of Vision Language Models
by: Kim, Mingyeong, et al.
Published: (2026)
Improving Unsupervised Video Object Segmentation via Fake Flow Generation
by: Cho, Suhwan, et al.
Published: (2024)
Dual Prototype Attention for Unsupervised Video Object Segmentation
by: Cho, Suhwan, et al.
Published: (2022)
Extending CLIP's Image-Text Alignment to Referring Image Segmentation
by: Kim, Seoyeon, et al.
Published: (2023)
Can MLLMs Reason About Visual Persuasion? Evaluating the Efficacy and Faithfulness of Reasoning
by: Lee, Naeun, et al.
Published: (2026)
Isometric Representation Learning for Disentangled Latent Space of Diffusion Models
by: Hahm, Jaehoon, et al.
Published: (2024)
MaskRIS: Semantic Distortion-aware Data Augmentation for Referring Image Segmentation
by: Lee, Minhyun, et al.
Published: (2024)
ATTIQA: Generalizable Image Quality Feature Extractor using Attribute-aware Pretraining
by: Kwon, Daekyu, et al.
Published: (2024)
MetaWeather: Few-Shot Weather-Degraded Image Restoration
by: Kim, Youngrae, et al.
Published: (2023)
Zero-Shot Industrial Anomaly Segmentation with Image-Aware Prompt Generation
by: Park, SoYoung, et al.
Published: (2025)
THE-Pose: Topological Prior with Hybrid Graph Fusion for Estimating Category-Level 6D Object Pose
by: Lee, Eunho, et al.
Published: (2025)
Equivariant Latent Alignment via Flow Matching under Group Symmetries
by: Kim, Sunghyun, et al.
Published: (2026)
Easy to Learn, Yet Hard to Forget: Towards Robust Unlearning Under Bias
by: Kwon, JuneHyoung, et al.
Published: (2026)
Tsanet: Temporal and Scale Alignment for Unsupervised Video Object Segmentation
by: Lee, Seunghoon, et al.
Published: (2023)
EVIDENT: Routing MLLM Adaptation through Entity-Grounded Visual Evidence for Cross-Domain Video Temporal Grounding
by: Ahn, Geo, et al.
Published: (2026)
Bridging the gap to real-world language-grounded visual concept learning
by: Jung, Whie, et al.
Published: (2025)
TALENT: Target-aware Efficient Tuning for Referring Image Segmentation
by: Jin, Shuo, et al.
Published: (2026)
GaussianVideo: Efficient Video Representation and Compression by Gaussian Splatting
by: Lee, Inseo, et al.
Published: (2025)
Sparse-DeRF: Deblurred Neural Radiance Fields from Sparse View
by: Lee, Dogyoon, et al.
Published: (2024)
Universal Few-Shot Spatial Control for Diffusion Models
by: Nguyen, Kiet T., et al.
Published: (2025)
DIAMOND: An LLM-Driven Agent for Context-Aware Baseball Highlight Summarization
by: Kang, Jeonghun, et al.
Published: (2025)
Feature Augmentation based Test-Time Adaptation
by: Cho, Younggeol, et al.
Published: (2024)
A Simple Baseline with Single-encoder for Referring Image Segmentation
by: Yu, Seonghoon, et al.
Published: (2024)
CoMoGaussian: Continuous Motion-Aware Gaussian Splatting from Motion-Blurred Images
by: Lee, Jungho, et al.
Published: (2025)
Multi-Context Temporal Consistent Modeling for Referring Video Object Segmentation
by: Choi, Sun-Hyuk, et al.
Published: (2025)
HyperFlow: Gradient-Free Emulation of Few-Shot Fine-Tuning
by: Kim, Donggyun, et al.
Published: (2025)
Chameleon: A Data-Efficient Generalist for Dense Visual Prediction in the Wild
by: Kim, Donggyun, et al.
Published: (2024)
CMTM: Cross-Modal Token Modulation for Unsupervised Video Object Segmentation
by: Jeon, Inseok, et al.
Published: (2026)