Zhang, Y. (2025). ARF-RLHF: Adaptive Reward-Following for RLHF through Emotion-Driven Self-Supervision and Trace-Biased Dynamic Optimization.
Chicago citation style: Zhang, YuXuan. ARF-RLHF: Adaptive Reward-Following for RLHF Through Emotion-Driven Self-Supervision and Trace-Biased Dynamic Optimization. 2025.
MLA citation style: Zhang, YuXuan. ARF-RLHF: Adaptive Reward-Following for RLHF Through Emotion-Driven Self-Supervision and Trace-Biased Dynamic Optimization. 2025.
Note: the citation formatting may not fully match the requirements of the respective style.