| Main Authors: | Dang, Shengqi, He, Yi, Lei, Jiaying, Qian, Ziqing, Cao, Nan |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2603.09286 |
Similar Items
EmotiCrafter: Text-to-Emotional-Image Generation based on Valence-Arousal Model
by: Dang, Shengqi, et al.
Published: (2025)
CogMorph: Cognitive Morphing Attacks for Text-to-Image Models
by: Jing, Zonglei, et al.
Published: (2025)
DensiCrafter: Physically-Constrained Generation and Fabrication of Self-Supporting Hollow Structures
by: Dang, Shengqi, et al.
Published: (2025)
CogView3: Finer and Faster Text-to-Image Generation via Relay Diffusion
by: Zheng, Wendi, et al.
Published: (2024)
GPT4Motion: Scripting Physical Motions in Text-to-Video Generation via Blender-Oriented GPT Planning
by: Lv, Jiaxi, et al.
Published: (2023)
Designed to Spread: A Generative Approach to Enhance Information Diffusion
by: Qian, Ziqing, et al.
Published: (2025)
DiffBlender: Composable and Versatile Multimodal Text-to-Image Diffusion Models
by: Kim, Sungnyun, et al.
Published: (2023)
DyCoRM: Dynamic Criterion-Aware Reward Modeling for Text-to-Image Generation
by: Qian, Jiaying, et al.
Published: (2026)
CogDoc: Towards Unified thinking in Documents
by: Xu, Qixin, et al.
Published: (2025)
CogOmniControl: Reasoning-Driven Controllable Video Generation via Creative Intent Cognition
by: Yang, Hongji, et al.
Published: (2026)
Not Just Pretty Pictures: Toward Interventional Data Augmentation Using Text-to-Image Generators
by: Yuan, Jianhao, et al.
Published: (2022)
CogStereo: Neural Stereo Matching with Implicit Spatial Cognition Embedding
by: Fang, Lihuang, et al.
Published: (2025)
Thinking in Blender: Staged Executable Inverse Graphics with Vision-Language Models
by: He, Guangzhao, et al.
Published: (2026)
Blendify -- Python rendering framework for Blender
by: Guzov, Vladimir, et al.
Published: (2024)
CogDriver: Integrating Cognitive Inertia for Temporally Coherent Planning in Autonomous Driving
by: Liu, Pei, et al.
Published: (2025)
CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer
by: Yang, Zhuoyi, et al.
Published: (2024)
CogVLA: Cognition-Aligned Vision-Language-Action Model via Instruction-Driven Routing & Sparsification
by: Li, Wei, et al.
Published: (2025)
Cog2Gen3D: Sculpturing 3D Semantic-Geometric Cognition for 3D Generation
by: Wang, Haonan, et al.
Published: (2026)
BlenderFusion: 3D-Grounded Visual Editing and Generative Compositing
by: Chen, Jiacheng, et al.
Published: (2025)
EgoCogNav: Cognition-aware Human Egocentric Navigation
by: Qiu, Zhiwen, et al.
Published: (2025)
CogVLM2: Visual Language Models for Image and Video Understanding
by: Hong, Wenyi, et al.
Published: (2024)
Frequency-Controlled Diffusion Model for Versatile Text-Guided Image-to-Image Translation
by: Gao, Xiang, et al.
Published: (2024)
Why Settle for One? Text-to-ImageSet Generation and Evaluation
by: Jia, Chengyou, et al.
Published: (2025)
Towards Explainable Partial-AIGC Image Quality Assessment
by: Qian, Jiaying, et al.
Published: (2025)
CogPortrait: Fine-Grained Eye-Region Control in Portrait Animation via Hierarchical Agent Planning
by: Feng, He, et al.
Published: (2026)
Swap Attention in Spatiotemporal Diffusions for Text-to-Video Generation
by: Wang, Wenjing, et al.
Published: (2023)
CogNav: Cognitive Process Modeling for Object Goal Navigation with LLMs
by: Cao, Yihan, et al.
Published: (2024)
Cog3DMap: Multi-View Vision-Language Reasoning with 3D Cognitive Maps
by: Gwak, Chanyoung, et al.
Published: (2026)
ChatGen: Automatic Text-to-Image Generation From FreeStyle Chatting
by: Jia, Chengyou, et al.
Published: (2024)
Motion Blender Gaussian Splatting for Dynamic Scene Reconstruction
by: Zhang, Xinyu, et al.
Published: (2025)
PhyCustom: Towards Realistic Physical Customization in Text-to-Image Generation
by: Wu, Fan, et al.
Published: (2025)
Multi-Grained Text-Guided Image Fusion for Multi-Exposure and Multi-Focus Scenarios
by: Tang, Mingwei, et al.
Published: (2025)
ReCogDrive: A Reinforced Cognitive Framework for End-to-End Autonomous Driving
by: Li, Yongkang, et al.
Published: (2025)
MultiBooth: Towards Generating All Your Concepts in an Image from Text
by: Zhu, Chenyang, et al.
Published: (2024)
FreqBlender: Enhancing DeepFake Detection by Blending Frequency Knowledge
by: Li, Hanzhe, et al.
Published: (2024)
Long-range Turbulence Mitigation: A Large-scale Dataset and A Coarse-to-fine Framework
by: Xu, Shengqi, et al.
Published: (2024)
OmniFM: Toward Modality-Robust and Task-Agnostic Federated Learning for Heterogeneous Medical Imaging
by: Liu, Meilin, et al.
Published: (2026)
CogVLM: Visual Expert for Pretrained Language Models
by: Wang, Weihan, et al.
Published: (2023)
DTVI: Dual-Stage Textual and Visual Intervention for Safe Text-to-Image Generation
by: Tan, Binhong, et al.
Published: (2026)
Text-DiFuse: An Interactive Multi-Modal Image Fusion Framework based on Text-modulated Diffusion Model
by: Zhang, Hao, et al.
Published: (2024)