:: Library Catalog

Bibliographic Details
Main Authors:	Park, Keunwoo; Chae, Jihye; Ahn, Joong Ho; Kweon, Jihoon
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2511.10993

Similar Items

Volumetric Conditioning Module to Control Pretrained Diffusion Models for 3D Medical Images
by: Ahn, Suhyun, et al.
Published: (2024)

CLUE: Leveraging Low-Rank Adaptation to Capture Latent Uncovered Evidence for Image Forgery Localization
by: Wang, Youqi, et al.
Published: (2025)

DAFT-GAN: Dual Affine Transformation Generative Adversarial Network for Text-Guided Image Inpainting
by: Lee, Jihoon, et al.
Published: (2024)

Two-Stage Approach for Brain MR Image Synthesis: 2D Image Synthesis and 3D Refinement
by: Cho, Jihoon, et al.
Published: (2024)

DiverseDream: Diverse Text-to-3D Synthesis with Augmented Text Embedding
by: Tran, Uy Dieu, et al.
Published: (2023)

MaDis-Stereo: Enhanced Stereo Matching via Distilled Masked Image Modeling
by: Ahn, Jihye, et al.
Published: (2024)

OceanSplat: Object-aware Gaussian Splatting with Trinocular View Consistency for Underwater Scene Reconstruction
by: Kweon, Minseong, et al.
Published: (2026)

360 in the Wild: Dataset for Depth Prediction and View Synthesis
by: Park, Kibaek, et al.
Published: (2024)

Reconciling Semantic Controllability and Diversity for Remote Sensing Image Synthesis with Hybrid Semantic Embedding
by: Liu, Junde, et al.
Published: (2024)

InstructBooth: Instruction-following Personalized Text-to-Image Generation
by: Chae, Daewon, et al.
Published: (2023)

DECOR: Decomposition and Projection of Text Embeddings for Text-to-Image Customization
by: Jang, Geonhui, et al.
Published: (2024)

MrGS: Multi-modal Radiance Fields with 3D Gaussian Splatting for RGB-Thermal Novel View Synthesis
by: Kweon, Minseong, et al.
Published: (2025)

End-to-end Training for Text-to-Image Synthesis using Dual-Text Embeddings
by: Ahmed, Yeruru Asrar, et al.
Published: (2025)

FIQ: Fundamental Question Generation with the Integration of Question Embeddings for Video Question Answering
by: Oh, Ju-Young, et al.
Published: (2025)

UniTT-Stereo: Unified Training of Transformer for Enhanced Stereo Matching
by: Kim, Soomin, et al.
Published: (2024)

Unified Diffusion Transformer for High-fidelity Text-Aware Image Restoration
by: Kim, Jin Hyeon, et al.
Published: (2025)

Principled Feature Disentanglement for High-Fidelity Unified Brain MRI Synthesis
by: Cho, Jihoon, et al.
Published: (2024)

Learning to Sample Effective and Diverse Prompts for Text-to-Image Generation
by: Yun, Taeyoung, et al.
Published: (2025)

MonoCLUE: Object-Aware Clustering Enhances Monocular 3D Object Detection
by: Yang, Sunghun, et al.
Published: (2025)

Temporally-Grounded Language Generation: A Benchmark for Real-Time Vision-Language Models
by: Yu, Keunwoo Peter, et al.
Published: (2025)

Text-Aware Image Restoration with Diffusion Models
by: Min, Jaewon, et al.
Published: (2025)

Phantom of Latent for Large Language and Vision Models
by: Lee, Byung-Kwan, et al.
Published: (2024)

Selectively Informative Description can Reduce Undesired Embedding Entanglements in Text-to-Image Personalization
by: Kim, Jimyeong, et al.
Published: (2024)

DiffSLT: Enhancing Diversity in Sign Language Translation via Diffusion Model
by: Moon, JiHwan, et al.
Published: (2024)

Harmonizing Visual and Textual Embeddings for Zero-Shot Text-to-Image Customization
by: Song, Yeji, et al.
Published: (2024)

Naïve PAINE: Lightweight Text-to-Image Generation Improvement with Prompt Evaluation
by: Kim, Joong Ho, et al.
Published: (2026)

DOS: Directional Object Separation in Text Embeddings for Multi-Object Image Generation
by: Byun, Dongnam, et al.
Published: (2025)

A Self-Supervised Approach on Motion Calibration for Enhancing Physical Plausibility in Text-to-Motion
by: Shim, Gahyeon, et al.
Published: (2026)

Semantic Image Synthesis with Unconditional Generator
by: Chae, Jungwoo, et al.
Published: (2024)

SpineCLUE: Automatic Vertebrae Identification Using Contrastive Learning and Uncertainty Estimation
by: Zhang, Sheng, et al.
Published: (2024)

Clustering-based Image-Text Graph Matching for Domain Generalization
by: Park, Nokyung, et al.
Published: (2023)

TinySeg: Model Optimizing Framework for Image Segmentation on Tiny Embedded Systems
by: Chae, Byungchul, et al.
Published: (2024)

Prompt-Softbox-Prompt: A Free-Text Embedding Control for Image Editing
by: Yang, Yitong, et al.
Published: (2024)

Dual Recursive Feedback on Generation and Appearance Latents for Pose-Robust Text-to-Image Diffusion
by: Kim, Jiwon, et al.
Published: (2025)

OmniBooth: Learning Latent Control for Image Synthesis with Multi-modal Instruction
by: Li, Leheng, et al.
Published: (2024)

SpatialLock: Precise Spatial Control in Text-to-Image Synthesis
by: Liu, Biao, et al.
Published: (2025)

MIGC: Multi-Instance Generation Controller for Text-to-Image Synthesis
by: Zhou, Dewei, et al.
Published: (2024)

Editable Image Elements for Controllable Synthesis
by: Mu, Jiteng, et al.
Published: (2024)

Espresso: High Compression For Rich Extraction From Videos for Your Vision-Language Model
by: Yu, Keunwoo Peter, et al.
Published: (2024)

RGB2Point: 3D Point Cloud Generation from Single RGB Images
by: Lee, Jae Joong, et al.
Published: (2024)