Library Catalog

Bibliographic Details
Main Authors:	Chakraborty, Souradip; Pourreza, Mohammadreza; Sun, Ruoxi; Song, Yiwen; Scherrer, Nino; Huang, Furong; Bedi, Amrit Singh; Beirami, Ahmad; Gu, Jindong; Palangi, Hamid; Pfister, Tomas
Format:	Preprint
Published:	2025
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2504.01931

Similar Items

RL with Learnable Textual Feedback: A Bilevel Approach
by: Singh, Utsav, et al.
Published: (2026)

PARL: A Unified Framework for Policy Alignment in Reinforcement Learning from Human Feedback
by: Chakraborty, Souradip, et al.
Published: (2023)

HEART: Emotionally-Driven Test-Time Scaling of Language Models
by: Pinto, Gabriela, et al.
Published: (2025)

Immune: Improving Safety Against Jailbreaks in Multi-modal LLMs via Inference-Time Alignment
by: Ghosal, Soumya Suvra, et al.
Published: (2024)

Uncertainty-Aware Answer Selection for Improved Reasoning in Multi-LLM Systems
by: Agrawal, Aakriti, et al.
Published: (2025)

LIAR: Leveraging Inference Time Alignment (Best-of-N) to Jailbreak LLMs in Seconds
by: Beetham, James, et al.
Published: (2024)

Safety Recovery in Reasoning Models Is Only a Few Early Steering Steps Away
by: Ghosal, Soumya Suvra, et al.
Published: (2026)

Why Pass@k Optimization Can Degrade Pass@1: Prompt Interference in LLM Post-training
by: Barakat, Anas, et al.
Published: (2026)

REBEL: Reward Regularization-Based Approach for Robotic Reinforcement Learning from Human Feedback
by: Chakraborty, Souradip, et al.
Published: (2023)

VARP: Reinforcement Learning from Vision-Language Model Feedback with Agent Regularized Preferences
by: Singh, Anukriti, et al.
Published: (2025)

Transfer Q Star: Principled Decoding for LLM Alignment
by: Chakraborty, Souradip, et al.
Published: (2024)

On the Global Optimality of Policy Gradient Methods in General Utility Reinforcement Learning
by: Barakat, Anas, et al.
Published: (2024)

Beyond Text: Utilizing Vocal Cues to Improve Decision Making in LLMs for Robot Navigation Tasks
by: Sun, Xingpeng, et al.
Published: (2024)

Code Comprehension then Auditing for Unsupervised LLM Evaluation
by: Patel, Bhrij, et al.
Published: (2024)

Does Thinking More always Help? Mirage of Test-Time Scaling in Reasoning Models
by: Ghosal, Soumya Suvra, et al.
Published: (2025)

SQL-GEN: Bridging the Dialect Gap for Text-to-SQL Via Synthetic Data And Model Merging
by: Pourreza, Mohammadreza, et al.
Published: (2024)

MIRA: Towards Mitigating Reward Hacking in Inference-Time Alignment of T2I Diffusion Models
by: Zhai, Kevin, et al.
Published: (2025)

MaxMin-RLHF: Alignment with Diverse Human Preferences
by: Chakraborty, Souradip, et al.
Published: (2024)

SAIL: Self-Improving Efficient Online Alignment of Large Language Models
by: Ding, Mucong, et al.
Published: (2024)

BalancedDPO: Adaptive Multi-Metric Alignment
by: Tamboli, Dipesh, et al.
Published: (2025)

PROPS: Progressively Private Self-alignment of Large Language Models
by: Teku, Noel, et al.
Published: (2025)

ScholarPeer: A Context-Aware Multi-Agent Framework for Automated Peer Review
by: Goyal, Palash, et al.
Published: (2026)

Align-Pro: A Principled Approach to Prompt Optimization for LLM Alignment
by: Trivedi, Prashant, et al.
Published: (2025)

FACT or Fiction: Can Truthful Mechanisms Eliminate Federated Free Riding?
by: Bornstein, Marco, et al.
Published: (2024)

DIPPER: Direct Preference Optimization to Accelerate Primitive-Enabled Hierarchical Reinforcement Learning
by: Singh, Utsav, et al.
Published: (2024)

Inducing Group Fairness in Prompt-Based Language Model Decisions
by: Atwood, James, et al.
Published: (2024)

Bounded Rationality for LLMs: Satisficing Alignment at Inference-Time
by: Chehade, Mohamad, et al.
Published: (2025)

Watch and Learn: Learning to Use Computers from Online Videos
by: Song, Chan Hee, et al.
Published: (2025)

Agentic Critical Training
by: Liu, Weize, et al.
Published: (2026)

VQQA: An Agentic Approach for Video Evaluation and Quality Improvement
by: Song, Yiwen, et al.
Published: (2026)

Test-Time Scaling in Diffusion LLMs via Hidden Semi-Autoregressive Experts
by: Lee, Jihoon, et al.
Published: (2025)

LLM-Based Multi-Agent Blackboard System for Information Discovery in Data Science
by: Salemi, Alireza, et al.
Published: (2025)

PLAN-TUNING: Post-Training Language Models to Learn Step-by-Step Planning for Complex Problem Solving
by: Parmar, Mihir, et al.
Published: (2025)

AI Cap-and-Trade: Efficiency Incentives for Accessibility and Sustainability
by: Bornstein, Marco, et al.
Published: (2026)

Towards Realistic Mechanisms That Incentivize Federated Participation and Contribution
by: Bornstein, Marco, et al.
Published: (2023)

Rethinking Adversarial Policies: A Generalized Attack Formulation and Provable Defense in RL
by: Liu, Xiangyu, et al.
Published: (2023)

DTS-SQL: Decomposed Text-to-SQL with Small Large Language Models
by: Pourreza, Mohammadreza, et al.
Published: (2024)

Auction-Based Regulation for Artificial Intelligence
by: Bornstein, Marco, et al.
Published: (2024)

On the Vulnerability of LLM/VLM-Controlled Robotics
by: Wu, Xiyang, et al.
Published: (2024)

Mitigating Object Hallucination in MLLMs via Data-augmented Phrase-level Alignment
by: Sarkar, Pritam, et al.
Published: (2024)