Maity, D. (2026). SAFE: Stable Alignment Finetuning with Entropy-Aware Predictive Control for Reinforcement Learning from Human Feedback (RLHF).
Chicago Style (17th ed.) Citation: Maity, Dipan. SAFE: Stable Alignment Finetuning with Entropy-Aware Predictive Control for Reinforcement Learning from Human Feedback (RLHF). 2026.
MLA (9th ed.) Citation: Maity, Dipan. SAFE: Stable Alignment Finetuning with Entropy-Aware Predictive Control for Reinforcement Learning from Human Feedback (RLHF). 2026.