| Main Authors: | Liu, Ruiheng, Chen, XiaoBing, Zhang, Jinyu, Zhang, Qiongwen, Zhang, Yu, Yang, Bailong |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2511.06778 |
Similar Items
Filling Memory Gaps: Enhancing Continual Semantic Parsing via SQL Syntax Variance-Guided LLMs without Real Data Replay
by: Liu, Ruiheng, et al.
Published: (2024)
Next-Generation Database Interfaces: A Survey of LLM-based Text-to-SQL
by: Hong, Zijin, et al.
Published: (2024)
Evil Geniuses: Delving into the Safety of LLM-based Agents
by: Tian, Yu, et al.
Published: (2023)
SaRO: Enhancing LLM Safety through Reasoning-based Alignment
by: Mou, Yutao, et al.
Published: (2025)
Decoupling Safety into Orthogonal Subspace: Cost-Efficient and Performance-Preserving Alignment for Large Language Models
by: Mou, Yutao, et al.
Published: (2025)
Privacy-Preserving Models for Legal Natural Language Processing
by: Yin, Ying, et al.
Published: (2022)
Advancing LLM Safe Alignment with Safety Representation Ranking
by: Du, Tianqi, et al.
Published: (2025)
LASA: Language-Agnostic Semantic Alignment at the Semantic Bottleneck for LLM Safety
by: Yang, Junxiao, et al.
Published: (2026)
MPO: Multilingual Safety Alignment via Reward Gap Optimization
by: Zhao, Weixiang, et al.
Published: (2025)
Unforgotten Safety: Preserving Safety Alignment of Large Language Models with Continual Learning
by: Alssum, Lama, et al.
Published: (2025)
Towards Comprehensive Post Safety Alignment of Large Language Models via Safety Patching
by: Zhao, Weixiang, et al.
Published: (2024)
Formal-LLM: Integrating Formal Language and Natural Language for Controllable LLM-based Agents
by: Li, Zelong, et al.
Published: (2024)
STAIR: Improving Safety Alignment with Introspective Reasoning
by: Zhang, Yichi, et al.
Published: (2025)
Language Evolution for Evading Social Media Regulation via LLM-based Multi-agent Simulation
by: Cai, Jinyu, et al.
Published: (2024)
MoGU: A Framework for Enhancing Safety of Open-Sourced LLMs While Preserving Their Usability
by: Du, Yanrui, et al.
Published: (2024)
Anonymous-by-Construction: An LLM-Driven Framework for Privacy-Preserving Text
by: Albanese, Federico, et al.
Published: (2026)
Efficient Safety Alignment of Large Language Models via Preference Re-ranking and Representation-based Reward Modeling
by: Deng, Qiyuan, et al.
Published: (2025)
Multilingual Blending: LLM Safety Alignment Evaluation with Language Mixture
by: Song, Jiayang, et al.
Published: (2024)
PrivGemo: Privacy-Preserving Dual-Tower Graph Retrieval for Empowering LLM Reasoning with Memory Augmentation
by: Tan, Xingyu, et al.
Published: (2026)
Orchestration-Free Customer Service Automation: A Privacy-Preserving and Flowchart-Guided Framework
by: Hong, Mengze, et al.
Published: (2026)
SQLucid: Grounding Natural Language Database Queries with Interactive Explanations
by: Tian, Yuan, et al.
Published: (2024)
GR-SAP: Generative Replay for Safety Alignment Preservation during Fine-Tuning
by: Fang, Zhouxiang, et al.
Published: (2026)
Privacy-Preserving Retrieval-Augmented Generation with Differential Privacy
by: Koga, Tatsuki, et al.
Published: (2024)
PAPILLON: Privacy Preservation from Internet-based and Local Language Model Ensembles
by: Siyan, Li, et al.
Published: (2024)
Language-Native Materials Processing Design by Lightly Structured Text Database and Reasoning Large Language Model
by: Liu, Yuze, et al.
Published: (2025)
PRIV-QA: Privacy-Preserving Question Answering for Cloud Large Language Models
by: Li, Guangwei, et al.
Published: (2025)
Evaluating and Improving Cultural Awareness of Reward Models for LLM Alignment
by: Zhang, Hongbin, et al.
Published: (2025)
Privacy-Preserving Language Model Inference with Instance Obfuscation
by: Yao, Yixiang, et al.
Published: (2024)
Generative Interfaces for Language Models
by: Chen, Jiaqi, et al.
Published: (2025)
Jailbreaking Attacks vs. Content Safety Filters: How Far Are We in the LLM Safety Arms Race?
by: Xin, Yuan, et al.
Published: (2025)
Privacy Preserving In-Context-Learning Framework for Large Language Models
by: Bhusal, Bishnu, et al.
Published: (2025)
Understanding Layer Significance in LLM Alignment
by: Shi, Guangyuan, et al.
Published: (2024)
The Hidden Dimensions of LLM Alignment: A Multi-Dimensional Analysis of Orthogonal Safety Directions
by: Pan, Wenbo, et al.
Published: (2025)
Privacy-Preserving Instructions for Aligning Large Language Models
by: Yu, Da, et al.
Published: (2024)
Safety Is Not Universal: The Selective Safety Trap in LLM Alignment
by: Brito, Iago Alves, et al.
Published: (2026)
Emulated Disalignment: Safety Alignment for Large Language Models May Backfire!
by: Zhou, Zhanhui, et al.
Published: (2024)
How Alignment and Jailbreak Work: Explain LLM Safety through Intermediate Hidden States
by: Zhou, Zhenhong, et al.
Published: (2024)
PFID: Privacy First Inference Delegation Framework for LLMs
by: Yang, Haoyan, et al.
Published: (2024)
Semantics-Preserved Distortion for Personal Privacy Protection in Information Management
by: Li, Jiajia, et al.
Published: (2022)
Situated Natural Language Explanations
by: Zhu, Zining, et al.
Published: (2023)