:: Library Catalog

Bibliographic Details
Main Authors:	Khattar, Vanshaj; Ding, Yuhao; Sel, Bilgehan; Lavaei, Javad; Jin, Ming
Format:	Preprint
Published:	2024
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2405.16601

Similar Items

Optimization Solution Functions as Deterministic Policies for Offline Reinforcement Learning
by: Khattar, Vanshaj, et al.
Published: (2024)

Safe and Balanced: A Framework for Constrained Multi-Objective Reinforcement Learning
by: Gu, Shangding, et al.
Published: (2024)

Balance Reward and Safety Optimization for Safe Reinforcement Learning: A Perspective of Gradient Manipulation
by: Gu, Shangding, et al.
Published: (2024)

Policy-based Primal-Dual Methods for Concave CMDP with Variance Reduction
by: Ying, Donghao, et al.
Published: (2022)

Algorithm of Thoughts: Enhancing Exploration of Ideas in Large Language Models
by: Sel, Bilgehan, et al.
Published: (2023)

Reinforcement Learning with Backtracking Feedback
by: Sel, Bilgehan, et al.
Published: (2026)

Pausing Policy Learning in Non-stationary Reinforcement Learning
by: Lee, Hyunin, et al.
Published: (2024)

Beyond Exact Gradients: Convergence of Stochastic Soft-Max Policy Gradient Methods with Entropy Regularization
by: Ding, Yuhao, et al.
Published: (2021)

Amplification Effects in Test-Time Reinforcement Learning: Safety and Reasoning Vulnerabilities
by: Khattar, Vanshaj, et al.
Published: (2026)

Skin-in-the-Game: Decision Making via Multi-Stakeholder Alignment in LLMs
by: Sel, Bilgehan, et al.
Published: (2024)

Detecting Zero-Day Attacks in Digital Substations via In-Context Learning
by: Manzoor, Faizan, et al.
Published: (2025)

Enhancing Efficiency of Safe Reinforcement Learning via Sample Manipulation
by: Gu, Shangding, et al.
Published: (2024)

Provably Efficient Sample Complexity for Robust CMDP
by: Ganguly, Sourav, et al.
Published: (2025)

StyleBench: Evaluating thinking styles in Large Language Models
by: Guo, Junyu, et al.
Published: (2025)

LLMs Should Express Uncertainty Explicitly
by: Guo, Junyu, et al.
Published: (2026)

LLMs Can Plan Only If We Tell Them
by: Sel, Bilgehan, et al.
Published: (2025)

TRSVR: An Adaptive Stochastic Trust-Region Method with Variance Reduction
by: Fang, Yuchen, et al.
Published: (2026)

Why is Normalization Preferred? A Worst-Case Complexity Theory for Stochastically Preconditioned SGD under Heavy-Tailed Noise
by: Fang, Yuchen, et al.
Published: (2026)

Don't Trade Off Safety: Diffusion Regularization for Constrained Offline RL
by: Guo, Junyu, et al.
Published: (2025)

High Probability Complexity Bounds of Trust-Region Stochastic Sequential Quadratic Programming with Heavy-Tailed Noise
by: Fang, Yuchen, et al.
Published: (2025)

Few-Shot Test-Time Optimization Without Retraining for Semiconductor Recipe Generation and Beyond
by: Gu, Shangding, et al.
Published: (2025)

Absence of spurious solutions far from ground truth: A low-rank analysis with high-order losses
by: Ma, Ziye, et al.
Published: (2024)

Subgradient Method for System Identification with Non-Smooth Objectives
by: Yalcin, Baturalp, et al.
Published: (2025)

Exact Recovery for System Identification with More Corrupt Data than Clean Data
by: Yalcin, Baturalp, et al.
Published: (2023)

Feasibility Consistent Representation Learning for Safe Reinforcement Learning
by: Cen, Zhepeng, et al.
Published: (2024)

A Trust-Region Interior-Point Stochastic Sequential Quadratic Programming Method
by: Fang, Yuchen, et al.
Published: (2026)

Structural Correspondence and Universal Approximation in Diagonal plus Low-Rank Neural Networks
by: Chen, Ying, et al.
Published: (2026)

OASIS: Conditional Distribution Shaping for Offline Safe Reinforcement Learning
by: Yao, Yihang, et al.
Published: (2024)

Controlling Underestimation Bias in Constrained Reinforcement Learning for Safe Exploration
by: Gao, Shiqing, et al.
Published: (2026)

Extreme Value Policy Optimization for Safe Reinforcement Learning
by: Gao, Shiqing, et al.
Published: (2026)

Exact Recovery Guarantees for Parameterized Nonlinear System Identification Problem under Sparse Disturbances or Semi-Oblivious Attacks
by: Zhang, Haixiang, et al.
Published: (2024)

Meta-model Neural Process for Probabilistic Power Flow under Varying N-1 System Topologies
by: Ly, Sel, et al.
Published: (2025)

Intersection of Reinforcement Learning and Bayesian Optimization for Intelligent Control of Industrial Processes: A Safe MPC-based DPG using Multi-Objective BO
by: Esfahani, Hossein Nejatbakhsh, et al.
Published: (2025)

A Scalable Approach for Safe and Robust Learning via Lipschitz-Constrained Networks
by: Abdeen, Zain ul, et al.
Published: (2025)

Safe In-Context Reinforcement Learning
by: Moeini, Amir, et al.
Published: (2025)

Counterfactually Safe Reinforcement Learning
by: Li, Jingyi, et al.
Published: (2026)

Online Bayesian Risk-Averse Reinforcement Learning
by: Wang, Yuhao, et al.
Published: (2025)

Trojan-Speak: Bypassing Constitutional Classifiers with No Jailbreak Tax via Adversarial Finetuning
by: Sel, Bilgehan, et al.
Published: (2026)

Policy Bifurcation in Safe Reinforcement Learning
by: Zou, Wenjun, et al.
Published: (2024)

Constraint-Conditioned Policy Optimization for Versatile Safe Reinforcement Learning
by: Yao, Yihang, et al.
Published: (2023)