:: Library Catalog

Bibliographic Details
Main Authors:	Feng, Zihao; Wang, Xiaoxue; Wu, Bowen; Cao, Hailong; Zhao, Tiejun; Yu, Qun; Wang, Baoxun
Format:	Preprint
Published:	2025
Subjects:	Machine Learning; Computation and Language
Online Access:	https://arxiv.org/abs/2509.14718
Similar Items

Improving Generalization in Intent Detection: GRPO with Reward-Based Curriculum Sampling
by: Feng, Zihao, et al.
Published: (2025)

Empowering LLMs in Task-Oriented Dialogues: A Domain-Independent Multi-Agent Framework and Fine-Tuning Strategy
by: Feng, Zihao, et al.
Published: (2025)

ToolRL: Reward is All Tool Learning Needs
by: Qian, Cheng, et al.
Published: (2025)

AdaCuRL: Adaptive Curriculum Reinforcement Learning with Invalid Sample Mitigation and Historical Revisiting
by: Li, Renda, et al.
Published: (2025)

DUMP: Automated Distribution-Level Curriculum Learning for RL-based LLM Post-training
by: Wang, Zhenting, et al.
Published: (2025)

RAIDEN-R1: Improving Role-awareness of LLMs via GRPO with Verifiable Reward
by: Wang, Zongsheng, et al.
Published: (2025)

ToolACE-R: Model-aware Iterative Training and Adaptive Refinement for Tool Learning
by: Zeng, Xingshan, et al.
Published: (2025)

ResRL: Boosting LLM Reasoning via Negative Sample Projection Residual Reinforcement Learning
by: Lin, Zihan, et al.
Published: (2026)

Divide-Then-Aggregate: An Efficient Tool Learning Method via Parallel Tool Invocation
by: Zhu, Dongsheng, et al.
Published: (2025)

Learning to Reason as Action Abstractions with Scalable Mid-Training RL
by: Zhang, Shenao, et al.
Published: (2025)

Enhancing Large Language Models' Machine Translation via Dynamic Focus Anchoring
by: Ding, Qiuyu, et al.
Published: (2025)

Tool Learning with Foundation Models
by: Qin, Yujia, et al.
Published: (2023)

Speculative Decoding Meets Quantization: Compatibility Evaluation and Hierarchical Framework Design
by: Zhang, Yudi, et al.
Published: (2025)

ToolExpander: Extending the Frontiers of Tool-Using Reinforcement Learning to Weak LLMs
by: Chen, Fu, et al.
Published: (2025)

AutoTool: Dynamic Tool Selection and Integration for Agentic Reasoning
by: Zou, Jiaru, et al.
Published: (2025)

LLMs in the Imaginarium: Tool Learning through Simulated Trial and Error
by: Wang, Boshi, et al.
Published: (2024)

iTool: Reinforced Fine-Tuning with Dynamic Deficiency Calibration for Advanced Tool Use
by: Zeng, Yirong, et al.
Published: (2025)

MetaTool: Facilitating Large Language Models to Master Tools with Meta-task Augmentation
by: Wang, Xiaohan, et al.
Published: (2024)

Learning Harmonized Representations for Speculative Sampling
by: Zhang, Lefan, et al.
Published: (2024)

DISA: Offline Importance Sampling for Distribution-Matching LLM-RL
by: Wang, Shaobo, et al.
Published: (2026)

Cross-Domain Bilingual Lexicon Induction via Pretrained Language Models
by: Ding, Qiuyu, et al.
Published: (2025)

VCRL: Variance-based Curriculum Reinforcement Learning for Large Language Models
by: Jiang, Guochao, et al.
Published: (2025)

Interpersonal Memory Matters: A New Task for Proactive Dialogue Utilizing Conversational History
by: Wu, Bowen, et al.
Published: (2025)

ToolACE: Winning the Points of LLM Function Calling
by: Liu, Weiwen, et al.
Published: (2024)

CL4KGE: A Curriculum Learning Method for Knowledge Graph Embedding
by: Liu, Yang, et al.
Published: (2024)

Are Tools Always Beneficial? Learning to Invoke Tools Adaptively for Dual-Mode Multimodal LLM Reasoning
by: Ma, Qinghe, et al.
Published: (2026)

Incentivizing Agentic Reasoning in LLM Judges via Tool-Integrated Reinforcement Learning
by: Xu, Ran, et al.
Published: (2025)

Mirage or Method? How Model-Task Alignment Induces Divergent RL Conclusions
by: Wu, Haoze, et al.
Published: (2025)

EnvFactory: Scaling Tool-Use Agents via Executable Environments Synthesis and Robust RL
by: Xu, Minrui, et al.
Published: (2026)

Reinforcement Learning for Tool-Integrated Interleaved Thinking towards Cross-Domain Generalization
by: Chen, Zhengyu, et al.
Published: (2025)

OctoTools: An Agentic Framework with Extensible Tools for Complex Reasoning
by: Lu, Pan, et al.
Published: (2025)

ToolACE-MT: Non-Autoregressive Generation for Agentic Multi-Turn Interaction
by: Zeng, Xingshan, et al.
Published: (2025)

Generalizable End-to-End Tool-Use RL with Synthetic CodeGym
by: Du, Weihua, et al.
Published: (2025)

Synthetic Data Generation & Multi-Step RL for Reasoning & Tool Use
by: Goldie, Anna, et al.
Published: (2025)

When Sharpening Becomes Collapse: Sampling Bias and Semantic Coupling in RL with Verifiable Rewards
by: Fan, Mingyuan, et al.
Published: (2026)

PromptAL: Sample-Aware Dynamic Soft Prompts for Few-Shot Active Learning
by: Xiang, Hui, et al.
Published: (2025)

Tool-Star: Empowering LLM-Brained Multi-Tool Reasoner via Reinforcement Learning
by: Dong, Guanting, et al.
Published: (2025)

Demystifying Reinforcement Learning for Long-Horizon Tool-Using Agents: A Comprehensive Recipe
by: Wu, Xixi, et al.
Published: (2026)

ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings
by: Hao, Shibo, et al.
Published: (2023)

Efficient Reinforcement Finetuning via Adaptive Curriculum Learning
by: Shi, Taiwei, et al.
Published: (2025)