Staff View: :: Library Catalog

Saved in:

Bibliographic Details
Main Authors:	Gan, Jinling, Liang, Churong, Li, Runnan
Format:	Preprint
Published:	2025
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2510.08175
Tags:	Add Tag No Tags, Be the first to tag this record!

_version_	1866916999258439680
author	Gan, Jinling Liang, Churong Li, Runnan
author_facet	Gan, Jinling Liang, Churong Li, Runnan
contents	The latency-quality tradeoff is a fundamental constraint in open-domain dialogue AI systems, since comprehensive knowledge access necessitates prohibitive response delays. Contemporary approaches offer two inadequate solutions: lightweight instruct models achieve sub-second latency but lack reasoning depth, while tool-augmented ReAct agents enhance factuality through external knowledge at the cost of synchronous execution that blocks interaction during retrieval processes. PMFR is thus proposed, with a temporal decoupling framework that fundamentally resolves the contradiction through asynchronous knowledge orchestration. PMFR employs three coordinated components: (1) a Knowledge Adequacy Evaluator for real-time sufficiency assessment, (2) a Lightweight Response Generator for immediate user interaction, and (3) an Asynchronous Knowledge Refinement Agent for background knowledge enhancement. This architecture maintains continuous conversational flow while progressively enriching knowledge coverage through intelligent triggering mechanisms. Evaluation results on TopiOCQA demonstrate PMFR outperforms brute-force scaling: PMFR achieves 95.3% latency reduction (23.38s -> 1.09s) while preserving response quality comparable to heavyweight synchronous baselines (GEval-C: 0.613 vs. 0.620).
format	Preprint
id	arxiv_https___arxiv_org_abs_2510_08175
institution	arXiv
publishDate	2025
record_format	arxiv
spellingShingle	Prepared mind, fast response: A temporal decoupling framework for adaptive knowledge orchestration in open-domain dialogue Gan, Jinling Liang, Churong Li, Runnan Artificial Intelligence The latency-quality tradeoff is a fundamental constraint in open-domain dialogue AI systems, since comprehensive knowledge access necessitates prohibitive response delays. Contemporary approaches offer two inadequate solutions: lightweight instruct models achieve sub-second latency but lack reasoning depth, while tool-augmented ReAct agents enhance factuality through external knowledge at the cost of synchronous execution that blocks interaction during retrieval processes. PMFR is thus proposed, with a temporal decoupling framework that fundamentally resolves the contradiction through asynchronous knowledge orchestration. PMFR employs three coordinated components: (1) a Knowledge Adequacy Evaluator for real-time sufficiency assessment, (2) a Lightweight Response Generator for immediate user interaction, and (3) an Asynchronous Knowledge Refinement Agent for background knowledge enhancement. This architecture maintains continuous conversational flow while progressively enriching knowledge coverage through intelligent triggering mechanisms. Evaluation results on TopiOCQA demonstrate PMFR outperforms brute-force scaling: PMFR achieves 95.3% latency reduction (23.38s -> 1.09s) while preserving response quality comparable to heavyweight synchronous baselines (GEval-C: 0.613 vs. 0.620).
title	Prepared mind, fast response: A temporal decoupling framework for adaptive knowledge orchestration in open-domain dialogue
topic	Artificial Intelligence
url	https://arxiv.org/abs/2510.08175

Similar Items