| Main Author: | Wu, Xiefeng |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2405.03341 |
Similar Items
From Reward Shaping to Q-Shaping: Achieving Unbiased Learning with LLM-Guided Knowledge
by: Wu, Xiefeng
Published: (2024)
Teaching RL Agents to Act Better: VLM as Action Advisor for Online Reinforcement Learning
by: Wu, Xiefeng, et al.
Published: (2025)
Extracting Heuristics from Large Language Models for Reward Shaping in Reinforcement Learning
by: Bhambri, Siddhant, et al.
Published: (2024)
A* Search Without Expansions: Learning Heuristic Functions with Deep Q-Networks
by: Agostinelli, Forest, et al.
Published: (2021)
Large Language Models as Common-Sense Heuristics
by: Borro, Andrey, et al.
Published: (2025)
Learning Memory-Enhanced Improvement Heuristics for Flexible Job Shop Scheduling
by: Wang, Jiaqi, et al.
Published: (2026)
BASE-Q: Bias and Asymmetric Scaling Enhanced Rotational Quantization for Large Language Models
by: He, Liulu, et al.
Published: (2025)
Neural-Network-Driven Reward Prediction as a Heuristic: Advancing Q-Learning for Mobile Robot Path Planning
by: Ji, Yiming, et al.
Published: (2024)
Adaptive Variational Continual Learning via Task-Heuristic Modelling
by: Yang, Fan
Published: (2024)
Deep Reinforcement Learning Guided Improvement Heuristic for Job Shop Scheduling
by: Zhang, Cong, et al.
Published: (2022)
Towards Learning Foundation Models for Heuristic Functions to Solve Pathfinding Problems
by: Khandelwal, Vedant, et al.
Published: (2024)
Q-SFT: Q-Learning for Language Models via Supervised Fine-Tuning
by: Hong, Joey, et al.
Published: (2024)
Demystifying the Recency Heuristic in Temporal-Difference Learning
by: Daley, Brett, et al.
Published: (2024)
Learning Admissible Heuristics for A*: Theory and Practice
by: Futuhi, Ehsan, et al.
Published: (2025)
Large Language Models to Enhance Bayesian Optimization
by: Liu, Tennison, et al.
Published: (2024)
Reinforcement Learning with Promising Tokens for Large Language Models
by: Pang, Jing-Cheng, et al.
Published: (2026)
Enhancing Multi-Step Reasoning Abilities of Language Models through Direct Q-Function Optimization
by: Ji, Kaixuan, et al.
Published: (2024)
Large Language Model-Enhanced Reinforcement Learning for Generic Bus Holding Control Strategies
by: Yu, Jiajie, et al.
Published: (2024)
Deep Heuristic Learning for Real-Time Urban Pathfinding
by: El-Ela, Mohamed Hussein Abo, et al.
Published: (2024)
Heuristic Transformer: Belief Augmented In-Context Reinforcement Learning
by: Dippel, Oliver, et al.
Published: (2025)
Q-resafe: Assessing Safety Risks and Quantization-aware Safety Patching for Quantized Large Language Models
by: Chen, Kejia, et al.
Published: (2025)
PCDVQ: Enhancing Vector Quantization for Large Language Models via Polar Coordinate Decoupling
by: Yue, Yuxuan, et al.
Published: (2025)
Large Language Model-Enhanced Multi-Armed Bandits
by: Sun, Jiahang, et al.
Published: (2025)
Imagination-Limited Q-Learning for Offline Reinforcement Learning
by: Liu, Wenhui, et al.
Published: (2025)
Enhancing Large Language Models for Time-Series Forecasting via Vector-Injected In-Context Learning
by: Zhang, Jianqi, et al.
Published: (2026)
Off-Policy Actor-Critic with Sigmoid-Bounded Entropy for Real-World Robot Learning
by: Wu, Xiefeng, et al.
Published: (2026)
Learning to Condition: A Neural Heuristic for Scalable MPE Inference
by: Malhotra, Brij, et al.
Published: (2025)
Generalizable Heuristic Generation Through LLMs with Meta-Optimization
by: Shi, Yiding, et al.
Published: (2025)
Reasoning-Enhanced Large Language Models for Molecular Property Prediction
by: Zhuang, Jiaxi, et al.
Published: (2025)
Hyperbolic Learning with Multimodal Large Language Models
by: Mandica, Paolo, et al.
Published: (2024)
Learning Safety Constraints for Large Language Models
by: Chen, Xin, et al.
Published: (2025)
Learning Social Heuristics for Human-Aware Path Planning
by: Eirale, Andrea, et al.
Published: (2025)
Vanishing Bias Heuristic-guided Reinforcement Learning Algorithm
by: Li, Qinru, et al.
Published: (2023)
RLAX: Large-Scale, Distributed Reinforcement Learning for Large Language Models on TPUs
by: Zhou, Runlong, et al.
Published: (2025)
An Unsupervised Learning Framework Combined with Heuristics for the Maximum Minimal Cut Problem
by: Liu, Huaiyuan, et al.
Published: (2024)
Reinforcement Learning-based Heuristics to Guide Domain-Independent Dynamic Programming
by: Narita, Minori, et al.
Published: (2025)
Understanding Sample Generation Strategies for Learning Heuristic Functions in Classical Planning
by: Bettker, R. V., et al.
Published: (2022)
Drift Q-Learning
by: Houssaini, Anas, et al.
Published: (2026)
Frictional Q-Learning
by: Kim, Hyunwoo, et al.
Published: (2025)
Flow Q-Learning
by: Park, Seohong, et al.
Published: (2025)