| Main Authors: | Khattar, Vanshaj, Ding, Yuhao, Sel, Bilgehan, Lavaei, Javad, Jin, Ming |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2405.16601 |
Similar Items
Optimization Solution Functions as Deterministic Policies for Offline Reinforcement Learning
by: Khattar, Vanshaj, et al.
Published: (2024)
Safe and Balanced: A Framework for Constrained Multi-Objective Reinforcement Learning
by: Gu, Shangding, et al.
Published: (2024)
Balance Reward and Safety Optimization for Safe Reinforcement Learning: A Perspective of Gradient Manipulation
by: Gu, Shangding, et al.
Published: (2024)
Policy-based Primal-Dual Methods for Concave CMDP with Variance Reduction
by: Ying, Donghao, et al.
Published: (2022)
Algorithm of Thoughts: Enhancing Exploration of Ideas in Large Language Models
by: Sel, Bilgehan, et al.
Published: (2023)
Reinforcement Learning with Backtracking Feedback
by: Sel, Bilgehan, et al.
Published: (2026)
Pausing Policy Learning in Non-stationary Reinforcement Learning
by: Lee, Hyunin, et al.
Published: (2024)
Beyond Exact Gradients: Convergence of Stochastic Soft-Max Policy Gradient Methods with Entropy Regularization
by: Ding, Yuhao, et al.
Published: (2021)
Amplification Effects in Test-Time Reinforcement Learning: Safety and Reasoning Vulnerabilities
by: Khattar, Vanshaj, et al.
Published: (2026)
Skin-in-the-Game: Decision Making via Multi-Stakeholder Alignment in LLMs
by: Sel, Bilgehan, et al.
Published: (2024)
Detecting Zero-Day Attacks in Digital Substations via In-Context Learning
by: Manzoor, Faizan, et al.
Published: (2025)
Enhancing Efficiency of Safe Reinforcement Learning via Sample Manipulation
by: Gu, Shangding, et al.
Published: (2024)
Provably Efficient Sample Complexity for Robust CMDP
by: Ganguly, Sourav, et al.
Published: (2025)
StyleBench: Evaluating thinking styles in Large Language Models
by: Guo, Junyu, et al.
Published: (2025)
LLMs Should Express Uncertainty Explicitly
by: Guo, Junyu, et al.
Published: (2026)
LLMs Can Plan Only If We Tell Them
by: Sel, Bilgehan, et al.
Published: (2025)
TRSVR: An Adaptive Stochastic Trust-Region Method with Variance Reduction
by: Fang, Yuchen, et al.
Published: (2026)
Why is Normalization Preferred? A Worst-Case Complexity Theory for Stochastically Preconditioned SGD under Heavy-Tailed Noise
by: Fang, Yuchen, et al.
Published: (2026)
Don't Trade Off Safety: Diffusion Regularization for Constrained Offline RL
by: Guo, Junyu, et al.
Published: (2025)
High Probability Complexity Bounds of Trust-Region Stochastic Sequential Quadratic Programming with Heavy-Tailed Noise
by: Fang, Yuchen, et al.
Published: (2025)
Few-Shot Test-Time Optimization Without Retraining for Semiconductor Recipe Generation and Beyond
by: Gu, Shangding, et al.
Published: (2025)
Absence of spurious solutions far from ground truth: A low-rank analysis with high-order losses
by: Ma, Ziye, et al.
Published: (2024)
Subgradient Method for System Identification with Non-Smooth Objectives
by: Yalcin, Baturalp, et al.
Published: (2025)
Exact Recovery for System Identification with More Corrupt Data than Clean Data
by: Yalcin, Baturalp, et al.
Published: (2023)
Feasibility Consistent Representation Learning for Safe Reinforcement Learning
by: Cen, Zhepeng, et al.
Published: (2024)
A Trust-Region Interior-Point Stochastic Sequential Quadratic Programming Method
by: Fang, Yuchen, et al.
Published: (2026)
Structural Correspondence and Universal Approximation in Diagonal plus Low-Rank Neural Networks
by: Chen, Ying, et al.
Published: (2026)
OASIS: Conditional Distribution Shaping for Offline Safe Reinforcement Learning
by: Yao, Yihang, et al.
Published: (2024)
Controlling Underestimation Bias in Constrained Reinforcement Learning for Safe Exploration
by: Gao, Shiqing, et al.
Published: (2026)
Extreme Value Policy Optimization for Safe Reinforcement Learning
by: Gao, Shiqing, et al.
Published: (2026)
Exact Recovery Guarantees for Parameterized Nonlinear System Identification Problem under Sparse Disturbances or Semi-Oblivious Attacks
by: Zhang, Haixiang, et al.
Published: (2024)
Meta-model Neural Process for Probabilistic Power Flow under Varying N-1 System Topologies
by: Ly, Sel, et al.
Published: (2025)
Intersection of Reinforcement Learning and Bayesian Optimization for Intelligent Control of Industrial Processes: A Safe MPC-based DPG using Multi-Objective BO
by: Esfahani, Hossein Nejatbakhsh, et al.
Published: (2025)
A Scalable Approach for Safe and Robust Learning via Lipschitz-Constrained Networks
by: Abdeen, Zain ul, et al.
Published: (2025)
Safe In-Context Reinforcement Learning
by: Moeini, Amir, et al.
Published: (2025)
Counterfactually Safe Reinforcement Learning
by: Li, Jingyi, et al.
Published: (2026)
Online Bayesian Risk-Averse Reinforcement Learning
by: Wang, Yuhao, et al.
Published: (2025)
Trojan-Speak: Bypassing Constitutional Classifiers with No Jailbreak Tax via Adversarial Finetuning
by: Sel, Bilgehan, et al.
Published: (2026)
Policy Bifurcation in Safe Reinforcement Learning
by: Zou, Wenjun, et al.
Published: (2024)
Constraint-Conditioned Policy Optimization for Versatile Safe Reinforcement Learning
by: Yao, Yihang, et al.
Published: (2023)