| Main Authors: | Niu, Yifan; Xiao, Han; Liu, Dongyi; Chen, Nuo; Li, Jia |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2512.11391 |
Similar Items
DHP: Efficient Scaling of MLLM Training with Dynamic Hybrid Parallelism
by: Niu, Yifan, et al.
Published: (2026)
Reducing the Safety Tax in LLM Safety Alignment with On-Policy Self-Distillation
by: Fu, Yu, et al.
Published: (2026)
Mitigating the Alignment Tax of RLHF
by: Lin, Yong, et al.
Published: (2023)
Safety Alignment as Continual Learning: Mitigating the Alignment Tax via Orthogonal Gradient Projection
by: Sun, Guanglong, et al.
Published: (2026)
Online Merging Optimizers for Boosting Rewards and Mitigating Tax in Alignment
by: Lu, Keming, et al.
Published: (2024)
State-wise Constrained Policy Optimization
by: Zhao, Weiye, et al.
Published: (2023)
Refine and Purify: Orthogonal Basis Optimization with Null-Space Denoising for Conditional Representation Learning
by: Wang, Jiaquan, et al.
Published: (2026)
Safety Tax: Safety Alignment Makes Your Large Reasoning Models Less Reasonable
by: Huang, Tiansheng, et al.
Published: (2025)
AlignIQL: Policy Alignment in Implicit Q-Learning through Constrained Optimization
by: He, Longxiang, et al.
Published: (2024)
Null-Space Flow Matching for MIMO Channel Estimation in Latency-Constrained Systems
by: Zhao, Junjie, et al.
Published: (2026)
Absolute State-wise Constrained Policy Optimization: High-Probability State-wise Constraints Satisfaction
by: Zhao, Weiye, et al.
Published: (2024)
Teleportation With Null Space Gradient Projection for Optimization Acceleration
by: Wu, Zihao, et al.
Published: (2025)
Stepwise Alignment for Constrained Language Model Policy Optimization
by: Wachi, Akifumi, et al.
Published: (2024)
Safety Game: Inference-Time Alignment of Black-Box LLMs via Constrained Optimization
by: Nguyen, Tuan, et al.
Published: (2025)
Cross-Paradigm Graph Backdoor Attacks with Promptable Subgraph Triggers
by: Liu, Dongyi, et al.
Published: (2025)
Paying Alignment Tax with Contrastive Learning
by: Korkmaz, Buse Sibel, et al.
Published: (2025)
Boost Post-Training Quantization via Null Space Optimization for Large Language Models
by: Zhao, Jiaqi, et al.
Published: (2025)
Constrained Policy Optimization via Sampling-Based Weight-Space Projection
by: Cao, Shengfan, et al.
Published: (2025)
What Is the Alignment Tax?
by: Young, Robin
Published: (2026)
GNSP: Gradient Null Space Projection for Preserving Cross-Modal Alignment in VLMs Continual Learning
by: Peng, Tiantian, et al.
Published: (2025)
Adaptive Test-Time Compute Allocation for Reasoning LLMs via Constrained Policy Optimization
by: Zhai, Zhiyuan, et al.
Published: (2026)
Absolute Policy Optimization
by: Zhao, Weiye, et al.
Published: (2023)
NPAT Null-Space Projected Adversarial Training Towards Zero Deterioration
by: Hu, Hanyi, et al.
Published: (2024)
Machine Unlearning via Null Space Calibration
by: Chen, Huiqiang, et al.
Published: (2024)
Proactive Constrained Policy Optimization with Preemptive Penalty
by: Yang, Ning, et al.
Published: (2025)
OSNIP: Breaking the Privacy-Utility-Efficiency Trilemma in LLM Inference via Obfuscated Semantic Null Space
by: Cao, Zhiyuan, et al.
Published: (2026)
Unraveling and Mitigating Safety Alignment Degradation of Vision-Language Models
by: Liu, Qin, et al.
Published: (2024)
EvoEdit: Evolving Null-space Alignment for Robust and Efficient Knowledge Editing
by: Lyu, Sicheng, et al.
Published: (2025)
Vulnerability-Aware Alignment: Mitigating Uneven Forgetting in Harmful Fine-Tuning
by: Chen, Liang, et al.
Published: (2025)
Coloring Between the Lines: Personalization in the Null Space of Planning Constraints
by: Silver, Tom, et al.
Published: (2025)
Provable Robust Overfitting Mitigation in Wasserstein Distributionally Robust Optimization
by: Liu, Shuang, et al.
Published: (2025)
Autoregressive Policy Optimization for Constrained Allocation Tasks
by: Winkel, David, et al.
Published: (2024)
Enhancing LLM Safety via Constrained Direct Preference Optimization
by: Liu, Zixuan, et al.
Published: (2024)
Latent Safety-Constrained Policy Approach for Safe Offline Reinforcement Learning
by: Koirala, Prajwal, et al.
Published: (2024)
e-COP: Episodic Constrained Optimization of Policies
by: Agnihotri, Akhil, et al.
Published: (2024)
Guardrails in Logit Space: Safety Token Regularization for LLM Alignment
by: Bach, Thong, et al.
Published: (2026)
InversionGNN: A Dual Path Network for Multi-Property Molecular Optimization
by: Niu, Yifan, et al.
Published: (2025)
Advantage Collapse in Group Relative Policy Optimization: Diagnosis and Mitigation
by: He, Xixiang, et al.
Published: (2026)
Constrained Policy Optimization with Cantelli-Bounded Value-at-Risk
by: Tangri, Rohan, et al.
Published: (2026)
Few Tokens, Big Leverage: Preserving Safety Alignment by Constraining Safety Tokens during Fine-tuning
by: Wang, Guoli, et al.
Published: (2026)