| Main Authors: | Tschannen, Michael, Pinto, André Susano, Kolesnikov, Alexander |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2411.19722 |
Similar Items
Jet: A Modern Transformer-Based Normalizing Flow
by: Kolesnikov, Alexander, et al.
Published: (2024)
Towards Truly Zero-shot Compositional Visual Reasoning with LLMs as Programmers
by: Stanić, Aleksandar, et al.
Published: (2024)
Enhancing Diffusion Models for High-Quality Image Generation
by: Shah, Jaineet, et al.
Published: (2024)
E-CAR: Efficient Continuous Autoregressive Image Generation via Multistage Modeling
by: Yuan, Zhihang, et al.
Published: (2024)
VTBench: Evaluating Visual Tokenizers for Autoregressive Image Generation
by: Lin, Huawei, et al.
Published: (2025)
Annealed Relaxation of Speculative Decoding for Faster Autoregressive Image Generation
by: Li, Xingyao, et al.
Published: (2026)
Watermarking Autoregressive Image Generation
by: Jovanović, Nikola, et al.
Published: (2025)
PaliGemma: A versatile 3B VLM for transfer
by: Beyer, Lucas, et al.
Published: (2024)
Astra: General Interactive World Model with Autoregressive Denoising
by: Zhu, Yixuan, et al.
Published: (2025)
MASC: Boosting Autoregressive Image Generation with a Manifold-Aligned Semantic Clustering
by: He, Lixuan, et al.
Published: (2025)
Optimizing Negative Prompts for Enhanced Aesthetics and Fidelity in Text-To-Image Generation
by: Ogezi, Michael, et al.
Published: (2024)
Contextualized Diffusion Models for Text-Guided Image and Video Generation
by: Yang, Ling, et al.
Published: (2024)
DocSynthv2: A Practical Autoregressive Modeling for Document Generation
by: Biswas, Sanket, et al.
Published: (2024)
Text-To-Image with Generative Adversarial Networks
by: Momen-Tayefeh, Mehrshad
Published: (2024)
Identity Curvature Laplace Approximation for Improved Out-of-Distribution Detection
by: Zhdanov, Maksim, et al.
Published: (2023)
RL for Consistency Models: Faster Reward Guided Text-to-Image Generation
by: Oertell, Owen, et al.
Published: (2024)
Pre-trained Language Models Do Not Help Auto-regressive Text-to-Image Generation
by: Zhang, Yuhui, et al.
Published: (2023)
SpectralAR: Spectral Autoregressive Visual Generation
by: Huang, Yuanhui, et al.
Published: (2025)
On the Scalability of Diffusion-based Text-to-Image Generation
by: Li, Hao, et al.
Published: (2024)
EdgeFusion: On-Device Text-to-Image Generation
by: Castells, Thibault, et al.
Published: (2024)
MetaFormer Baselines for Vision
by: Yu, Weihao, et al.
Published: (2022)
PoGDiff: Product-of-Gaussians Diffusion Models for Imbalanced Text-to-Image Generation
by: Wang, Ziyan, et al.
Published: (2025)
ReText: Text Boosts Generalization in Image-Based Person Re-identification
by: Mamedov, Timur, et al.
Published: (2026)
Image Captions are Natural Prompts for Text-to-Image Models
by: Lei, Shiye, et al.
Published: (2023)
HART: Efficient Visual Generation with Hybrid Autoregressive Transformer
by: Tang, Haotian, et al.
Published: (2024)
Towards Understanding and Quantifying Uncertainty for Text-to-Image Generation
by: Franchi, Gianni, et al.
Published: (2024)
Compositional Text-to-Image Generation with Dense Blob Representations
by: Nie, Weili, et al.
Published: (2024)
ScribFormer: Transformer Makes CNN Work Better for Scribble-based Medical Image Segmentation
by: Li, Zihan, et al.
Published: (2024)
Soft-TransFormers for Continual Learning
by: Kang, Haeyong, et al.
Published: (2024)
Diversifying Deep Ensembles: A Saliency Map Approach for Enhanced OOD Detection, Calibration, and Accuracy
by: Dereka, Stanislav, et al.
Published: (2023)
Text-Aware Image Restoration with Diffusion Models
by: Min, Jaewon, et al.
Published: (2025)
Curriculum-DPO++: Direct Preference Optimization via Data and Model Curricula for Text-to-Image Generation
by: Croitoru, Florinel-Alin, et al.
Published: (2026)
Skrr: Skip and Re-use Text Encoder Layers for Memory Efficient Text-to-Image Generation
by: Seo, Hoigi, et al.
Published: (2025)
UNCAGE: Contrastive Attention Guidance for Masked Generative Transformers in Text-to-Image Generation
by: Kang, Wonjun, et al.
Published: (2025)
AlignGuard: Scalable Safety Alignment for Text-to-Image Generation
by: Liu, Runtao, et al.
Published: (2024)
Minority-Focused Text-to-Image Generation via Prompt Optimization
by: Um, Soobin, et al.
Published: (2024)
Self-Evaluation Unlocks Any-Step Text-to-Image Generation
by: Yu, Xin, et al.
Published: (2025)
PreciseCam: Precise Camera Control for Text-to-Image Generation
by: Bernal-Berdun, Edurne, et al.
Published: (2025)
Understanding Implosion in Text-to-Image Generative Models
by: Ding, Wenxin, et al.
Published: (2024)
Autoregressive Adversarial Post-Training for Real-Time Interactive Video Generation
by: Lin, Shanchuan, et al.
Published: (2025)