| Main Authors: | Li, Cheng; Liu, Jiexiong; Chen, Yixuan; Zhou, Qihang; Meta, KunLun |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | Artificial Intelligence |
| Online Access: | https://arxiv.org/abs/2506.19466 |
| _version_ | 1866918072104779776 |
|---|---|
| author | Li, Cheng; Liu, Jiexiong; Chen, Yixuan; Zhou, Qihang; Meta, KunLun |
| contents | This paper introduces KunLunBaizeRAG, a reinforcement learning-driven reasoning framework designed to enhance the reasoning capabilities of large language models (LLMs) in complex multi-hop question-answering tasks. The framework addresses key limitations of traditional RAG, such as retrieval drift, information redundancy, and strategy rigidity. Key innovations include the RAG-driven Reasoning Alignment (RDRA) mechanism, the Search-Think Iterative Enhancement (STIE) mechanism, the Network-Local Intelligent Routing (NLR) mechanism, and a progressive hybrid training strategy. Experimental results demonstrate significant improvements in exact match (EM) and LLM-judged score (LJ) across four benchmarks, highlighting the framework's robustness and effectiveness in complex reasoning scenarios. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2506_19466 |
| institution | arXiv |
| publishDate | 2025 |
| record_format | arxiv |
| title | KunLunBaizeRAG: Reinforcement Learning Driven Inference Performance Leap for Large Language Models |
| topic | Artificial Intelligence |
| url | https://arxiv.org/abs/2506.19466 |