:: Library Catalog

Bibliographic Details
Main Authors:	Xie, Yingsha; Huang, Tiansheng; Yang, Enneng; Min, Rui; Lu, Wenjie; Cao, Xiaochun; Tan, Naiqiang; Shen, Li
Format:	Preprint
Published:	2026
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2602.02136

Similar Items

Panacea: Mitigating Harmful Fine-tuning for Large Language Models via Post-fine-tuning Perturbation
by: Wang, Yibo, et al.
Published: (2025)

Safety Tax: Safety Alignment Makes Your Large Reasoning Models Less Reasonable
by: Huang, Tiansheng, et al.
Published: (2025)

R1-Compress: Long Chain-of-Thought Compression via Chunk Compression and Search
by: Wang, Yibo, et al.
Published: (2025)

Ada-R1: Hybrid-CoT via Bi-Level Adaptive Reasoning Optimization
by: Luo, Haotian, et al.
Published: (2025)

O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning
by: Luo, Haotian, et al.
Published: (2025)

Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities
by: Yang, Enneng, et al.
Published: (2024)

OptMerge: Unifying Multimodal LLM Capabilities and Modalities via Model Merging
by: Wei, Yongxian, et al.
Published: (2025)

Surgery: Mitigating Harmful Fine-Tuning for Large Language Models via Attention Sink
by: Liu, Guozhi, et al.
Published: (2026)

Targeted Vaccine: Safety Alignment for Large Language Models against Harmful Fine-Tuning via Layer-wise Perturbation
by: Liu, Guozhi, et al.
Published: (2024)

Bag of Tricks for Inference-time Computation of LLM Reasoning
by: Liu, Fan, et al.
Published: (2025)

Not All Thoughts are Generated Equal: Efficient LLM Reasoning via Multi-Turn Reinforcement Learning
by: Ning, Yansong, et al.
Published: (2025)

SafeThinker: Reasoning about Risk to Deepen Safety Beyond Shallow Alignment
by: Fang, Xianya, et al.
Published: (2026)

Efficient and Effective Weight-Ensembling Mixture of Experts for Multi-Task Model Merging
by: Shen, Li, et al.
Published: (2024)

Model Unmerging: Making Your Models Unmergeable for Secure Model Sharing
by: Wang, Zihao, et al.
Published: (2025)

Coarse-to-Fine Proposal Refinement Framework for Audio Temporal Forgery Detection and Localization
by: Wu, Junyan, et al.
Published: (2024)

UltraHorizon: Benchmarking Agent Capabilities in Ultra Long-Horizon Scenarios
by: Luo, Haotian, et al.
Published: (2025)

Distributionally Robust Graph Out-of-Distribution Recommendation via Diffusion Model
by: Zhao, Chu, et al.
Published: (2025)

Auto-Search and Refinement: An Automated Framework for Gender Bias Mitigation in Large Language Models
by: Xu, Yue, et al.
Published: (2025)

When Models Outthink Their Safety: Unveiling and Mitigating Self-Jailbreak in Large Reasoning Models
by: Mao, Yingzhi, et al.
Published: (2025)

Hard Negative Sampling via Large Language Models for Recommendation
by: Zhao, Chu, et al.
Published: (2025)

Pharmacist: Safety Alignment Data Curation for Large Language Models against Harmful Fine-tuning
by: Liu, Guozhi, et al.
Published: (2025)

A Comprehensive Survey of Forgetting in Deep Learning Beyond Continual Learning
by: Wang, Zhenyi, et al.
Published: (2023)

Compress the Easy, Explore the Hard: Difficulty-Aware Entropy Regularization for Efficient LLM Reasoning
by: Luo, Qin-Wen, et al.
Published: (2026)

Beyond the Safety Tax: Mitigating Unsafe Text-to-Image Generation via External Safety Rectification
by: Meng, Xiangtao, et al.
Published: (2025)

Antidote: Post-fine-tuning Safety Alignment for Large Language Models against Harmful Fine-tuning
by: Huang, Tiansheng, et al.
Published: (2024)

Think in Safety: Unveiling and Mitigating Safety Alignment Collapse in Multimodal Large Reasoning Model
by: Lou, Xinyue, et al.
Published: (2025)

Lisa: Lazy Safety Alignment for Large Language Models against Harmful Fine-tuning Attack
by: Huang, Tiansheng, et al.
Published: (2024)

MeasHalu: Mitigation of Scientific Measurement Hallucinations for Large Language Models with Enhanced Reasoning
by: Huang, Ruijun, et al.
Published: (2026)

Online Merging Optimizers for Boosting Rewards and Mitigating Tax in Alignment
by: Lu, Keming, et al.
Published: (2024)

Self-Polish: Enhance Reasoning in Large Language Models via Problem Refinement
by: Xi, Zhiheng, et al.
Published: (2023)

Long Is More Important Than Difficult for Training Reasoning Models
by: Shen, Si, et al.
Published: (2025)

Kestrel: Grounding Self-Refinement for LVLM Hallucination Mitigation
by: Mao, Jiawei, et al.
Published: (2026)

Rebellion: Noise-Robust Reasoning Training for Audio Reasoning Models
by: Huang, Tiansheng, et al.
Published: (2025)

Safety Alignment as Continual Learning: Mitigating the Alignment Tax via Orthogonal Gradient Projection
by: Sun, Guanglong, et al.
Published: (2026)

Causal Direct Preference Optimization for Distributionally Robust Generative Recommendation
by: Zhao, Chu, et al.
Published: (2026)

Automatic Pruning Discovery for Large Language Models
by: Kang, Haidong, et al.
Published: (2025)

Chain of Risk: Safety Failures in Large Reasoning Models and Mitigation via Adaptive Multi-Principle Steering
by: Li, Xiaomin, et al.
Published: (2026)

PokeLLMon: A Human-Parity Agent for Pokemon Battles with Large Language Models
by: Hu, Sihao, et al.
Published: (2024)

Uncovering, Explaining, and Mitigating the Superficial Safety of Backdoor Defense
by: Min, Rui, et al.
Published: (2024)

Systematic Engineering of Escherichia coli to Enhance 1,6‐Hexamethylenediamine Biosynthesis and Mitigate Byproduct 1,5‐Pentanediamine
by: Chen, Zanwen, et al.
Published: (2026)