| Main Authors: | Wang, Yunxiao, Liu, Meng, Jiang, Kaiyu, Wen, Bin, Yang, Fan, Gao, Tingting, Liao, Lizi |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2508.09521 |
Similar Items
STRIDE-ED: A Strategy-Grounded Stepwise Reasoning Framework for Empathetic Dialogue Systems
by: Ji, Hongru, et al.
Published: (2026)
Debate, Reflect, and Distill: Multi-Agent Feedback with Tree-Structured Preference Optimization for Efficient Language Model Enhancement
by: Zhou, Xiaofeng, et al.
Published: (2025)
Aligned Multi-View Scripts for Universal Chart-to-Code Generation
by: Zhang, Zhihan, et al.
Published: (2026)
Internalizing Outcome Supervision into Process Supervision: A New Paradigm for Reinforcement Learning for Reasoning
by: Ding, Fei, et al.
Published: (2026)
Kwai-STaR: Transform LLMs into State-Transition Reasoners
by: Lu, Xingyu, et al.
Published: (2024)
RLVER: Reinforcement Learning with Verifiable Emotion Rewards for Empathetic Agents
by: Wang, Peisong, et al.
Published: (2025)
Boosting Chart-to-Code Generation in MLLM via Dual Preference-Guided Refinement
by: Zhang, Zhihan, et al.
Published: (2025)
Empathy Level Alignment via Reinforcement Learning for Empathetic Response Generation
by: Ma, Hui, et al.
Published: (2024)
CTSM: Combining Trait and State Emotions for Empathetic Response Model
by: Yufeng, Wang, et al.
Published: (2024)
ContextRL: Enhancing MLLM's Knowledge Discovery Efficiency with Context-Augmented RL
by: Lu, Xingyu, et al.
Published: (2026)
Markov Chain of Thought for Efficient Mathematical Reasoning
by: Yang, Wen, et al.
Published: (2024)
UR$^2$: Unify RAG and Reasoning through Reinforcement Learning
by: Li, Weitao, et al.
Published: (2025)
From Reasoning to Learning: A Survey on Hypothesis Discovery and Rule Learning with Large Language Models
by: He, Kaiyu, et al.
Published: (2025)
Exploiting Emotion-Semantic Correlations for Empathetic Response Generation
by: Yang, Zhou, et al.
Published: (2024)
SAKE: Structured Agentic Knowledge Extrapolation for Complex LLM Reasoning via Reinforcement Learning
by: He, Jiashu, et al.
Published: (2025)
A Survey on Neural Question Generation: Methods, Applications, and Prospects
by: Guo, Shasha, et al.
Published: (2024)
VLM as Policy: Common-Law Content Moderation Framework for Short Video Platform
by: Lu, Xingyu, et al.
Published: (2025)
Reasoning Through Execution: Unifying Process and Outcome Rewards for Code Generation
by: Yu, Zhuohao, et al.
Published: (2024)
PCQPR: Proactive Conversational Question Planning with Reflection
by: Guo, Shasha, et al.
Published: (2024)
Beyond Outcome Verification: Verifiable Process Reward Models for Structured Reasoning
by: Pronesti, Massimiliano, et al.
Published: (2026)
PCL-Reasoner-V1.5: Advancing Math Reasoning with Offline Reinforcement Learning
by: Lu, Yao, et al.
Published: (2026)
Reinforcement Learning with Verifiable Rewards Implicitly Incentivizes Correct Reasoning in Base LLMs
by: Wen, Xumeng, et al.
Published: (2025)
ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning
by: Chen, Mingyang, et al.
Published: (2025)
SGSH: Stimulate Large Language Models with Skeleton Heuristics for Knowledge Base Question Generation
by: Guo, Shasha, et al.
Published: (2024)
Unmasking Reasoning Processes: A Process-aware Benchmark for Evaluating Structural Mathematical Reasoning in LLMs
by: Zheng, Xiang, et al.
Published: (2026)
From Reasoning Chains to Verifiable Subproblems: Curriculum Reinforcement Learning Enables Credit Assignment for LLM Reasoning
by: Jiang, Xitai, et al.
Published: (2026)
VCap: Hypergeometric Rewards for Weak-to-Strong Visual Captioning
by: Lu, Xingyu, et al.
Published: (2026)
Conditional Advantage Estimation for Reinforcement Learning in Large Reasoning Models
by: Chen, Guanxu, et al.
Published: (2025)
DeepTrans: Deep Reasoning Translation via Reinforcement Learning
by: Wang, Jiaan, et al.
Published: (2025)
InftyThink+: Effective and Efficient Infinite-Horizon Reasoning via Reinforcement Learning
by: Yan, Yuchen, et al.
Published: (2026)
Cause-Aware Empathetic Response Generation via Chain-of-Thought Fine-Tuning
by: Chen, Xinhao, et al.
Published: (2024)
ReCode: Reinforcing Code Generation with Reasoning-Process Rewards
by: Fan, Lishui, et al.
Published: (2025)
VideoTemp-o3: Harmonizing Temporal Grounding and Video Understanding in Agentic Thinking-with-Videos
by: Liu, Wenqi, et al.
Published: (2026)
STICKERCONV: Generating Multimodal Empathetic Responses from Scratch
by: Zhang, Yiqun, et al.
Published: (2024)
Emotional Support with LLM-based Empathetic Dialogue Generation
by: Wang, Shiquan, et al.
Published: (2025)
Towards Reliable and Empathetic Depression-Diagnosis-Oriented Chats
by: Lan, Kunyao, et al.
Published: (2024)
Med-U1: Incentivizing Unified Medical Reasoning in LLMs via Large-scale Reinforcement Learning
by: Zhang, Xiaotian, et al.
Published: (2025)
STRuCT-LLM: Unifying Tabular and Graph Reasoning with Reinforcement Learning for Semantic Parsing
by: Stoisser, Josefa Lia, et al.
Published: (2025)
Triplet-Structured Knowledge Integration for Multi-Turn Medical Reasoning
by: Meng, Zhaohan, et al.
Published: (2025)
Coupled Variational Reinforcement Learning for Language Model General Reasoning
by: Wen, Xueru, et al.
Published: (2025)