Staff View: PFDial: A Structured Dialogue Instruction Fine-tuning Method Based on UML Flowcharts :: Library Catalog

Bibliographic Details
Main Authors:	Zhang, Ming; Wang, Yuhui; Shen, Yujiong; Yang, Tingyi; Jiang, Changhao; Wu, Yilong; Dou, Shihan; Chen, Qinhao; Xi, Zhiheng; Zhang, Zhihao; Dong, Yi; Wang, Zhen; Fei, Zhihui; Wan, Mingyang; Liang, Tao; Ma, Guojun; Zhang, Qi; Gui, Tao; Huang, Xuanjing
Format:	Preprint
Published:	2025
Subjects:	Computation and Language; Artificial Intelligence; Machine Learning
Online Access:	https://arxiv.org/abs/2503.06706

_version_	1866913891437510656
author	Zhang, Ming; Wang, Yuhui; Shen, Yujiong; Yang, Tingyi; Jiang, Changhao; Wu, Yilong; Dou, Shihan; Chen, Qinhao; Xi, Zhiheng; Zhang, Zhihao; Dong, Yi; Wang, Zhen; Fei, Zhihui; Wan, Mingyang; Liang, Tao; Ma, Guojun; Zhang, Qi; Gui, Tao; Huang, Xuanjing
author_facet	Zhang, Ming; Wang, Yuhui; Shen, Yujiong; Yang, Tingyi; Jiang, Changhao; Wu, Yilong; Dou, Shihan; Chen, Qinhao; Xi, Zhiheng; Zhang, Zhihao; Dong, Yi; Wang, Zhen; Fei, Zhihui; Wan, Mingyang; Liang, Tao; Ma, Guojun; Zhang, Qi; Gui, Tao; Huang, Xuanjing
contents	Process-driven dialogue systems, which operate under strict predefined process constraints, are essential in customer service and equipment maintenance scenarios. Although Large Language Models (LLMs) have shown remarkable progress in dialogue and reasoning, they still struggle to solve these strictly constrained dialogue tasks. To address this challenge, we construct the Process Flow Dialogue (PFDial) dataset, which contains 12,705 high-quality Chinese dialogue instructions derived from 440 flowcharts containing 5,055 process nodes. Based on the PlantUML specification, each UML flowchart is converted into atomic dialogue units, i.e., structured five-tuples. Experimental results demonstrate that a 7B model trained with merely 800 samples and a 0.5B model trained on the full dataset can both surpass 90% accuracy. Additionally, the 8B model can surpass GPT-4o by up to 43.88%, with an average of 11.00%. We further evaluate models' performance on challenging backward transitions in process flows and conduct an in-depth analysis of various dataset formats to reveal their impact on model performance in handling decision and sequential branches. The data is released at https://github.com/KongLongGeFDU/PFDial.
format	Preprint
id	arxiv_https___arxiv_org_abs_2503_06706
institution	arXiv
publishDate	2025
record_format	arxiv
title	PFDial: A Structured Dialogue Instruction Fine-tuning Method Based on UML Flowcharts
topic	Computation and Language; Artificial Intelligence; Machine Learning
url	https://arxiv.org/abs/2503.06706

Similar Items