| Main Author: | Luo, Zhiling |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2502.08908 |
Similar Items
Mathematics with large language models as provers and verifiers
by: Duc, Hieu Le, et al.
Published: (2025)
LEAN-GitHub: Compiling GitHub LEAN repositories for a versatile LEAN prover
by: Wu, Zijian, et al.
Published: (2024)
Interpretable Modeling of Deep Reinforcement Learning Driven Scheduling
by: Li, Boyang, et al.
Published: (2024)
Automatic database description generation for Text-to-SQL
by: Gao, Yingqi, et al.
Published: (2025)
Multimodal RAG-driven Anomaly Detection and Classification in Laser Powder Bed Fusion using Large Language Models
by: Khanghah, Kiarash Naghavi, et al.
Published: (2025)
EDiT: A Local-SGD-Based Efficient Distributed Training Method for Large Language Models
by: Cheng, Jialiang, et al.
Published: (2024)
Developing an AI Course for Synthetic Chemistry Students
by: Zheng, Zhiling
Published: (2025)
Predicting Scale-Up of Metal-Organic Framework Syntheses with Large Language Models
by: Walther, Peter, et al.
Published: (2026)
Sketch Then Paint: Hierarchical Reinforcement Learning for Diffusion Multi-Modal Large Language Models
by: Luo, Siqi, et al.
Published: (2026)
Vision Language Model for Interpretable and Fine-grained Detection of Safety Compliance in Diverse Workplaces
by: Chen, Zhiling, et al.
Published: (2024)
Scaling Reinforcement Learning for Content Moderation with Large Language Models
by: Firooz, Hamed, et al.
Published: (2025)
In-Context Reinforcement Learning for Tool Use in Large Language Models
by: Ye, Yaoqi, et al.
Published: (2026)
Large Language Model as a Policy Teacher for Training Reinforcement Learning Agents
by: Zhou, Zihao, et al.
Published: (2023)
A Technical Survey of Reinforcement Learning Techniques for Large Language Models
by: Srivastava, Saksham Sahai, et al.
Published: (2025)
On Predictability of Reinforcement Learning Dynamics for Large Language Models
by: Cai, Yuchen, et al.
Published: (2025)
Discovering Reinforcement Learning Interfaces with Large Language Models
by: Jaswal, Akshat Singh, et al.
Published: (2026)
Rethinking Agentic Reinforcement Learning In Large Language Models
by: Cui, Fangming, et al.
Published: (2026)
Reinforcement Learning with Promising Tokens for Large Language Models
by: Pang, Jing-Cheng, et al.
Published: (2026)
Reinforcement Learning Problem Solving with Large Language Models
by: Gholamian, Sina, et al.
Published: (2024)
A Survey of Reinforcement Learning for Large Language Models under Data Scarcity: Challenges and Solutions
by: Yu, Zhiyin, et al.
Published: (2026)
Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models
by: Xu, Fengli, et al.
Published: (2025)
Large Language Models are Biased Reinforcement Learners
by: Hayes, William M., et al.
Published: (2024)
Large Language Model-enhanced Reinforcement Learning for Low-Altitude Economy Networking
by: Cai, Lingyi, et al.
Published: (2025)
To Mix or To Merge: Toward Multi-Domain Reinforcement Learning for Large Language Models
by: Wang, Haoqing, et al.
Published: (2026)
RLAX: Large-Scale, Distributed Reinforcement Learning for Large Language Models on TPUs
by: Zhou, Runlong, et al.
Published: (2025)
Contextual Reinforcement in Multimodal Token Compression for Large Language Models
by: Piero, Naderdel, et al.
Published: (2025)
SEM: Reinforcement Learning for Search-Efficient Large Language Models
by: Sha, Zeyang, et al.
Published: (2025)
Efficient Reinforcement Learning for Large Language Models with Intrinsic Exploration
by: Sun, Yan, et al.
Published: (2025)
Reinforcement Learning Fine-Tunes a Sparse Subnetwork in Large Language Models
by: Balashov, Andrii
Published: (2025)
On the Entropy Dynamics in Reinforcement Fine-Tuning of Large Language Models
by: Wang, Shumin, et al.
Published: (2026)
Offline Regularised Reinforcement Learning for Large Language Models Alignment
by: Richemond, Pierre Harvey, et al.
Published: (2024)
Improving the Language Understanding Capabilities of Large Language Models Using Reinforcement Learning
by: Hu, Bokai, et al.
Published: (2024)
A formal proof of the Sands-Sauer-Woodrow theorem using the Rocq prover and mathcomp/ssreflect
by: Chancelier, Jean-Philippe
Published: (2026)
Large Language Model for Patent Concept Generation
by: Ren, Runtao, et al.
Published: (2024)
Teaching Large Language Models to Maintain Contextual Faithfulness via Synthetic Tasks and Reinforcement Learning
by: Si, Shuzheng, et al.
Published: (2025)
Semi-supervised Fine-tuning for Large Language Models
by: Luo, Junyu, et al.
Published: (2024)
Joint Knowledge Base Completion and Question Answering by Combining Large Language Models and Small Language Models
by: Liu, Yinan, et al.
Published: (2026)
Think in Games: Learning to Reason in Games via Reinforcement Learning with Large Language Models
by: Liao, Yi, et al.
Published: (2025)
Reinforced MLLM: A Survey on RL-Based Reasoning in Multimodal Large Language Models
by: Zhou, Guanghao, et al.
Published: (2025)
Re2LLM: Reflective Reinforcement Large Language Model for Session-based Recommendation
by: Wang, Ziyan, et al.
Published: (2024)