| Main Authors: | Cheng, Jiaxin, Zhao, Zixu, He, Tong, Xiao, Tianjun, Zhou, Yicong, Zhang, Zheng |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2409.04847 |
Similar Items
VideoSAM: Open-World Video Segmentation
by: Guo, Pinxue, et al.
Published: (2024)
MEMO: Human-like Crisp Edge Detection Using Masked Edge Prediction
by: Cheng, Jiaxin, et al.
Published: (2026)
Towards Generalized Multimodal Homography Estimation
by: You, Jinkun, et al.
Published: (2026)
Learning from Mistakes: Iterative Prompt Relabeling for Text-to-Image Diffusion Model Training
by: Chen, Xinyan, et al.
Published: (2023)
Rethinking Training Dynamics in Scale-wise Autoregressive Generation
by: Zhou, Gengze, et al.
Published: (2025)
Layout-Guided Controllable Pathology Image Generation with In-Context Diffusion Transformers
by: Shou, Yuntao, et al.
Published: (2026)
LoCo: Locally Constrained Training-Free Layout-to-Image Synthesis
by: Zhao, Peiang, et al.
Published: (2023)
Quaternion Infrared Visible Image Fusion
by: Yang, Weihua, et al.
Published: (2025)
Quaternion Sparse Decomposition for Multi-focus Color Image Fusion
by: Yang, Weihua, et al.
Published: (2025)
SpotActor: Training-Free Layout-Controlled Consistent Image Generation
by: Wang, Jiahao, et al.
Published: (2024)
Training-Free Layout-to-Image Generation with Marginal Attention Constraints
by: Chen, Huancheng, et al.
Published: (2024)
LayoutDiffusion: Controllable Diffusion Model for Layout-to-image Generation
by: Zheng, Guangcong, et al.
Published: (2023)
Hallucination of Multimodal Large Language Models: A Survey
by: Bai, Zechen, et al.
Published: (2024)
COHO: Context-Sensitive City-Scale Hierarchical Urban Layout Generation
by: He, Liu, et al.
Published: (2024)
Manga Generation via Layout-controllable Diffusion
by: Chen, Siyu, et al.
Published: (2024)
ToLo: A Two-Stage, Training-Free Layout-To-Image Generation Framework For High-Overlap Layouts
by: Huang, Linhao, et al.
Published: (2025)
Layout-Conditioned Autoregressive Text-to-Image Generation via Structured Masking
by: Zheng, Zirui, et al.
Published: (2025)
Automatic Layout Planning for Visually-Rich Documents with Instruction-Following Models
by: Zhu, Wanrong, et al.
Published: (2024)
Unsupervised Open-Vocabulary Object Localization in Videos
by: Fan, Ke, et al.
Published: (2023)
RichControl: Structure- and Appearance-Rich Training-Free Spatial Control for Text-to-Image Generation
by: Pang, Lexi, et al.
Published: (2025)
EchoGen: Cycle-Consistent Learning for Unified Layout-Image Generation and Understanding
by: Zou, Kai, et al.
Published: (2026)
CreatiLayout: Siamese Multimodal Diffusion Transformer for Creative Layout-to-Image Generation
by: Zhang, Hui, et al.
Published: (2024)
ContextGen: Contextual Layout Anchoring for Identity-Consistent Multi-Instance Generation
by: Xu, Ruihang, et al.
Published: (2025)
Check, Locate, Rectify: A Training-Free Layout Calibration System for Text-to-Image Generation
by: Gong, Biao, et al.
Published: (2023)
LTSim: Layout Transportation-based Similarity Measure for Evaluating Layout Generation
by: Otani, Mayu, et al.
Published: (2024)
Training-free Composite Scene Generation for Layout-to-Image Synthesis
by: Liu, Jiaqi, et al.
Published: (2024)
REAR: Rethinking Visual Autoregressive Models via Generator-Tokenizer Consistency Regularization
by: He, Qiyuan, et al.
Published: (2025)
Guidance Matters: Rethinking the Evaluation Pitfall for Text-to-Image Generation
by: Xie, Dian, et al.
Published: (2026)
INT: Instance-Specific Negative Mining for Task-Generic Promptable Segmentation
by: Hu, Jian, et al.
Published: (2025)
Dense Cross-Scale Image Alignment With Fully Spatial Correlation and Just Noticeable Difference Guidance
by: You, Jinkun, et al.
Published: (2025)
DCT-Mamba3D: Spectral Decorrelation and Spatial-Spectral Feature Extraction for Hyperspectral Image Classification
by: Cao, Weijia, et al.
Published: (2025)
Multi-Scale Diffusion: Enhancing Spatial Layout in High-Resolution Panoramic Image Generation
by: Zhang, Xiaoyu, et al.
Published: (2024)
Bridging Information Asymmetry in Text-video Retrieval: A Data-centric Approach
by: Bai, Zechen, et al.
Published: (2024)
Cross-Domain Document Layout Analysis Using Document Style Guide
by: Wu, Xingjiao, et al.
Published: (2022)
STAY Diffusion: Styled Layout Diffusion Model for Diverse Layout-to-Image Generation
by: Wang, Ruyu, et al.
Published: (2025)
uLayout: Unified Room Layout Estimation for Perspective and Panoramic Images
by: Lee, Jonathan, et al.
Published: (2025)
ComposeAnyone: Controllable Layout-to-Human Generation with Decoupled Multimodal Conditions
by: Zhang, Shiyue, et al.
Published: (2025)
OmniDocLayout: Towards Diverse Document Layout Generation via Coarse-to-Fine LLM Learning
by: Kang, Hengrui, et al.
Published: (2025)
Masked Training for Robust Arrhythmia Detection from Digitalized Multiple Layout ECG Images
by: Zhang, Shanwei, et al.
Published: (2025)
Uni-Layout: Integrating Human Feedback in Unified Layout Generation and Evaluation
by: Lu, Shuo, et al.
Published: (2025)