| Main Authors: | Xiao, Hongcan, Xiao, Xinyue, Wang, Yilin, Zhang, Yue, Qi, Yonggang |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2604.08042 |
Similar Items
StickMotion: Generating 3D Human Motions by Drawing a Stickman
by: Wang, Tao, et al.
Published: (2025)
ShadowDraw: From Any Object to Shadow-Drawing Compositional Art
by: Luo, Rundong, et al.
Published: (2025)
VS-LLM: Visual-Semantic Depression Assessment based on LLM for Drawing Projection Test
by: Wu, Meiqi, et al.
Published: (2025)
From Drawings to Decisions: A Hybrid Vision-Language Framework for Parsing 2D Engineering Drawings into Structured Manufacturing Knowledge
by: Khan, Muhammad Tayyab, et al.
Published: (2025)
Text-Enhanced Panoptic Symbol Spotting in CAD Drawings
by: Liu, Xianlin, et al.
Published: (2025)
ViRED: Prediction of Visual Relations in Engineering Drawings
by: Gu, Chao, et al.
Published: (2024)
SridBench: Benchmark of Scientific Research Illustration Drawing of Image Generation Model
by: Chang, Yifan, et al.
Published: (2025)
MechVQA: Benchmarking and Enhancing Multimodal LLMs on Comprehensive Mechanical Drawing Understanding
by: Kou, Qian, et al.
Published: (2026)
Reinforcing Spatial Reasoning in Vision-Language Models with Interwoven Thinking and Visual Drawing
by: Wu, Junfei, et al.
Published: (2025)
DrawMotion: Generating 3D Human Motions by Freehand Drawing
by: Wang, Tao, et al.
Published: (2026)
Fine-Tuning Vision-Language Model for Automated Engineering Drawing Information Extraction
by: Khan, Muhammad Tayyab, et al.
Published: (2024)
PhyDrawGen: Physically Grounded Diagram Generation from Natural Language
by: Haque, Nafiul, et al.
Published: (2026)
Draw-In-Mind: Rebalancing Designer-Painter Roles in Unified Multimodal Models Benefits Image Editing
by: Zeng, Ziyun, et al.
Published: (2025)
DrawVideo: Generating Long Video from Storyboard Keyframe Sketches
by: Xu, Chuanzhi, et al.
Published: (2026)
The Image Reconstruction Game: Drawing Common Ground Through Iterative Multimodal Dialogue
by: Hakimov, Sherzod, et al.
Published: (2026)
Few Channels Draw The Whole Picture: Revealing Massive Activations in Diffusion Transformers
by: Turri, Evelyn, et al.
Published: (2026)
PCEvE: Part Contribution Evaluation Based Model Explanation for Human Figure Drawing Assessment and Beyond
by: Lee, Jongseo, et al.
Published: (2024)
Think-Before-Draw: Decomposing Emotion Semantics & Fine-Grained Controllable Expressive Talking Head Generation
by: Shi, Hanlei, et al.
Published: (2025)
Generating Sketches in a Hierarchical Auto-Regressive Process for Flexible Sketch Drawing Manipulation at Stroke-Level
by: Zang, Sicong, et al.
Published: (2025)
PanGu-Draw: Advancing Resource-Efficient Text-to-Image Synthesis with Time-Decoupled Training and Reusable Coop-Diffusion
by: Lu, Guansong, et al.
Published: (2023)
Drawing the Line: Deep Segmentation for Extracting Art from Ancient Etruscan Mirrors
by: Sterzinger, Rafael, et al.
Published: (2024)
Automated Parsing of Engineering Drawings for Structured Information Extraction Using a Fine-tuned Document Understanding Transformer
by: Khan, Muhammad Tayyab, et al.
Published: (2025)
PyPotteryInk: One-Step Diffusion Model for Sketch to Publication-ready Archaeological Drawings
by: Cardarelli, Lorenzo
Published: (2025)
Pencils to Pixels: A Systematic Study of Creative Drawings across Children, Adults and AI
by: Nath, Surabhi S, et al.
Published: (2025)
How to Enable LLM with 3D Capacity? A Survey of Spatial Reasoning in LLM
by: Zha, Jirong, et al.
Published: (2025)
Draw Your Mind: Personalized Generation via Condition-Level Modeling in Text-to-Image Diffusion Models
by: Kim, Hyungjin, et al.
Published: (2025)
Spatial 3D-LLM: Exploring Spatial Awareness in 3D Vision-Language Models
by: Wang, Xiaoyan, et al.
Published: (2025)
3D-Agent: Tri-Modal Multi-Agent Collaboration for Scalable 3D Object Annotation
by: Zhang, Jusheng, et al.
Published: (2026)
3D-VCD: Hallucination Mitigation in 3D-LLM Embodied Agents through Visual Contrastive Decoding
by: Ogunleye, Makanjuola, et al.
Published: (2026)
RegionPLC: Regional Point-Language Contrastive Learning for Open-World 3D Scene Understanding
by: Yang, Jihan, et al.
Published: (2023)
Is Contrastive Distillation Enough for Learning Comprehensive 3D Representations?
by: Zhang, Yifan, et al.
Published: (2024)
Semantic Aware Feature Extraction for Enhanced 3D Reconstruction
by: Nap, Ronald, et al.
Published: (2026)
Speed3R: Sparse Feed-forward 3D Reconstruction Models
by: Ren, Weining, et al.
Published: (2026)
Real-Time Intuitive AI Drawing System for Collaboration: Enhancing Human Creativity through Formal and Contextual Intent Integration
by: Song, Jookyung, et al.
Published: (2025)
A Multi-Stage Hybrid Framework for Automated Interpretation of Multi-View Engineering Drawings Using Vision Language Model
by: Khan, Muhammad Tayyab, et al.
Published: (2025)
Wonder3D++: Cross-domain Diffusion for High-fidelity 3D Generation from a Single Image
by: Yang, Yuxiao, et al.
Published: (2025)
MUSES: 3D-Controllable Image Generation via Multi-Modal Agent Collaboration
by: Ding, Yanbo, et al.
Published: (2024)
Spatio-Temporal Token Pruning for Efficient High-Resolution GUI Agents
by: Xu, Zhou, et al.
Published: (2026)
TGBFormer: Transformer-GraphFormer Blender Network for Video Object Detection
by: Qi, Qiang, et al.
Published: (2025)
Hyperbolic Contrastive Learning for Hierarchical 3D Point Cloud Embedding
by: Liu, Yingjie, et al.
Published: (2025)