Bibliographic details
Main authors: Xia, Xiao; Zhang, Dan; Liao, Zibo; Hou, Zhenyu; Sun, Tianrui; Li, Jing; Fu, Ling; Dong, Yuxiao
Format: Preprint
Published: 2024
Subjects: Computation and Language, Machine Learning, Software Engineering
Online access: https://arxiv.org/abs/2410.21909
_version_ 1866915359749046272
author Xia, Xiao
Zhang, Dan
Liao, Zibo
Hou, Zhenyu
Sun, Tianrui
Li, Jing
Fu, Ling
Dong, Yuxiao
contents The modeling of industrial scenes is essential for simulations in industrial manufacturing. While large language models (LLMs) have shown significant progress in generating general 3D scenes from textual descriptions, generating industrial scenes with LLMs poses a unique challenge due to their demand for precise measurements and positioning, requiring complex planning over spatial arrangement. To address this challenge, we introduce SceneGenAgent, an LLM-based agent for generating industrial scenes through C# code. SceneGenAgent ensures precise layout planning through a structured and calculable format, layout verification, and iterative refinement to meet the quantitative requirements of industrial scenarios. Experimental results demonstrate that LLMs powered by SceneGenAgent exceed their original performance, reaching a success rate of up to 81.0% in real-world industrial scene generation tasks and effectively meeting most scene generation requirements. To further enhance accessibility, we construct SceneInstruct, a dataset designed for fine-tuning open-source LLMs to integrate into SceneGenAgent. Experiments show that fine-tuning open-source LLMs on SceneInstruct yields significant performance improvements, with Llama3.1-70B approaching the capabilities of GPT-4o. Our code and data are available at https://github.com/THUDM/SceneGenAgent.
format Preprint
id arxiv_https___arxiv_org_abs_2410_21909
institution arXiv
publishDate 2024
record_format arxiv
title SceneGenAgent: Precise Industrial Scene Generation with Coding Agent
topic Computation and Language
Machine Learning
Software Engineering
url https://arxiv.org/abs/2410.21909