| Main Authors: | Barnhart, Logan; Bafghi, Reza Akbarian; Becker, Stephen; Raissi, Maziar |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2503.09025 |
Similar Items
From Centerlines to Hemodynamics: Anisotropic RBF Decoders for Coronary Arteries
by: Bafghi, Reza Akbarian, et al.
Published: (2026)
Test-Driven Agentic Framework for Reliable Robot Controller
by: Tripathi, Shivanshu, et al.
Published: (2026)
MixDiff: Mixing Natural and Synthetic Images for Robust Self-Supervised Representations
by: Bafghi, Reza Akbarian, et al.
Published: (2024)
Parameter Efficient Fine-tuning of Self-supervised ViTs without Catastrophic Forgetting
by: Bafghi, Reza Akbarian, et al.
Published: (2024)
Fine Tuning without Catastrophic Forgetting via Selective Low Rank Adaptation
by: Bafghi, Reza Akbarian, et al.
Published: (2025)
Where Did Your Model Learn That? Label-free Influence for Self-supervised Learning
by: Harilal, Nidhin, et al.
Published: (2024)
Solving the Inverse Alignment Problem for Efficient RLHF
by: Krishna, Shambhavi, et al.
Published: (2024)
Understanding Tool-Augmented Agents for Lean Formalization: A Factorial Analysis
by: Zhang, Ke, et al.
Published: (2026)
ChatGLM-RLHF: Practices of Aligning Large Language Models with Human Feedback
by: Hou, Zhenyu, et al.
Published: (2024)
Why Is RLHF Alignment Shallow? A Gradient Analysis
by: Young, Robin
Published: (2026)
MM-RLHF: The Next Step Forward in Multimodal LLM Alignment
by: Zhang, Yi-Fan, et al.
Published: (2025)
More RLHF, More Trust? On The Impact of Preference Alignment On Trustworthiness
by: Li, Aaron J., et al.
Published: (2024)
MaxMin-RLHF: Alignment with Diverse Human Preferences
by: Chakraborty, Souradip, et al.
Published: (2024)
SenseAI: A Human-in-the-Loop Dataset for RLHF-Aligned Financial Sentiment Reasoning
by: Kabalisa, Berny
Published: (2026)
Deep LPPLS: Forecasting of temporal critical points in natural, engineering and financial systems
by: Nielsen, Joshua, et al.
Published: (2024)
PUNCH: Physics-informed Uncertainty-aware Network for Coronary Hemodynamics
by: Thakur, Sukirt, et al.
Published: (2026)
A Systematic Evaluation of Preference Aggregation in Federated RLHF for Pluralistic Alignment of LLMs
by: Srewa, Mahmoud, et al.
Published: (2025)
Culturally Adaptive Explainable LLM Assessment for Multilingual Information Disorder: A Human-in-the-Loop Approach
by: Jouneghani, Maziar Kianimoghadam
Published: (2026)
RLHF in an SFT Way: From Optimal Solution to Reward-Weighted Alignment
by: Du, Yuhao, et al.
Published: (2025)
Proxy-RLHF: Decoupling Generation and Alignment in Large Language Model with Proxy
by: Zhu, Yu, et al.
Published: (2024)
PKU-SafeRLHF: Towards Multi-Level Safety Alignment for LLMs with Human Preference
by: Ji, Jiaming, et al.
Published: (2024)
Physics-Informed Machine Learning for Smart Additive Manufacturing
by: Sharma, Rahul, et al.
Published: (2024)
Derailing Non-Answers via Logit Suppression at Output Subspace Boundaries in RLHF-Aligned Language Models
by: Dam, Harvey, et al.
Published: (2025)
RLHF Workflow: From Reward Modeling to Online RLHF
by: Dong, Hanze, et al.
Published: (2024)
RLHF: A comprehensive Survey for Cultural, Multimodal and Low Latency Alignment Methods
by: Sharma, Raghav, et al.
Published: (2025)
ELPINN: Eulerian Lagrangian Physics-Informed Neural Network
by: Thakur, Sukirt, et al.
Published: (2025)
Online Optimization with Unknown Time-Varying Parameters from Noisy Gradient Measurements
by: Tripathi, Shivanshu, et al.
Published: (2026)
From RLHF to Direct Alignment: A Theoretical Unification of Preference Learning for Large Language Models
by: Raheja, Tarun, et al.
Published: (2026)
MKJ at SemEval-2026 Task 9: A Comparative Study of Generalist, Specialist, and Ensemble Strategies for Multilingual Polarization
by: Jouneghani, Maziar Kianimoghadam
Published: (2026)
OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework
by: Hu, Jian, et al.
Published: (2024)
Balanced Actor Initialization: Stable RLHF Training of Distillation-Based Reasoning Models
by: Zheng, Chen, et al.
Published: (2025)
RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback
by: Yu, Tianyu, et al.
Published: (2023)
Learning Parameterized Nonlinear Elasticity on Curved Surfaces
by: Liu, Yankang, et al.
Published: (2026)
AT-RAG: An Adaptive RAG Model Enhancing Query Efficiency with Topic Filtering and Iterative Reasoning
by: Rezaei, Mohammad Reza, et al.
Published: (2024)
Taming Overconfidence in LLMs: Reward Calibration in RLHF
by: Leng, Jixuan, et al.
Published: (2024)
Failure Modes of Maximum Entropy RLHF
by: Çağatan, Ömer Veysel, et al.
Published: (2025)
Language Models Learn to Mislead Humans via RLHF
by: Wen, Jiaxin, et al.
Published: (2024)
RLHF and IIA: Perverse Incentives
by: Xu, Wanqiao, et al.
Published: (2023)
Reward-Robust RLHF in LLMs
by: Yan, Yuzi, et al.
Published: (2024)
FormalAlign: Automated Alignment Evaluation for Autoformalization
by: Lu, Jianqiao, et al.
Published: (2024)