:: Library Catalog

Bibliographic Details
Main Authors:	Liu, Ruiheng; Chen, XiaoBing; Zhang, Jinyu; Zhang, Qiongwen; Zhang, Yu; Yang, Bailong
Format:	Preprint
Published:	2025
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2511.06778

Similar Items

Filling Memory Gaps: Enhancing Continual Semantic Parsing via SQL Syntax Variance-Guided LLMs without Real Data Replay
by: Liu, Ruiheng, et al.
Published: (2024)

Next-Generation Database Interfaces: A Survey of LLM-based Text-to-SQL
by: Hong, Zijin, et al.
Published: (2024)

Evil Geniuses: Delving into the Safety of LLM-based Agents
by: Tian, Yu, et al.
Published: (2023)

SaRO: Enhancing LLM Safety through Reasoning-based Alignment
by: Mou, Yutao, et al.
Published: (2025)

Decoupling Safety into Orthogonal Subspace: Cost-Efficient and Performance-Preserving Alignment for Large Language Models
by: Mou, Yutao, et al.
Published: (2025)

Privacy-Preserving Models for Legal Natural Language Processing
by: Yin, Ying, et al.
Published: (2022)

Advancing LLM Safe Alignment with Safety Representation Ranking
by: Du, Tianqi, et al.
Published: (2025)

LASA: Language-Agnostic Semantic Alignment at the Semantic Bottleneck for LLM Safety
by: Yang, Junxiao, et al.
Published: (2026)

MPO: Multilingual Safety Alignment via Reward Gap Optimization
by: Zhao, Weixiang, et al.
Published: (2025)

Unforgotten Safety: Preserving Safety Alignment of Large Language Models with Continual Learning
by: Alssum, Lama, et al.
Published: (2025)

Towards Comprehensive Post Safety Alignment of Large Language Models via Safety Patching
by: Zhao, Weixiang, et al.
Published: (2024)

Formal-LLM: Integrating Formal Language and Natural Language for Controllable LLM-based Agents
by: Li, Zelong, et al.
Published: (2024)

STAIR: Improving Safety Alignment with Introspective Reasoning
by: Zhang, Yichi, et al.
Published: (2025)

Language Evolution for Evading Social Media Regulation via LLM-based Multi-agent Simulation
by: Cai, Jinyu, et al.
Published: (2024)

MoGU: A Framework for Enhancing Safety of Open-Sourced LLMs While Preserving Their Usability
by: Du, Yanrui, et al.
Published: (2024)

Anonymous-by-Construction: An LLM-Driven Framework for Privacy-Preserving Text
by: Albanese, Federico, et al.
Published: (2026)

Efficient Safety Alignment of Large Language Models via Preference Re-ranking and Representation-based Reward Modeling
by: Deng, Qiyuan, et al.
Published: (2025)

Multilingual Blending: LLM Safety Alignment Evaluation with Language Mixture
by: Song, Jiayang, et al.
Published: (2024)

PrivGemo: Privacy-Preserving Dual-Tower Graph Retrieval for Empowering LLM Reasoning with Memory Augmentation
by: Tan, Xingyu, et al.
Published: (2026)

Orchestration-Free Customer Service Automation: A Privacy-Preserving and Flowchart-Guided Framework
by: Hong, Mengze, et al.
Published: (2026)

SQLucid: Grounding Natural Language Database Queries with Interactive Explanations
by: Tian, Yuan, et al.
Published: (2024)

GR-SAP: Generative Replay for Safety Alignment Preservation during Fine-Tuning
by: Fang, Zhouxiang, et al.
Published: (2026)

Privacy-Preserving Retrieval-Augmented Generation with Differential Privacy
by: Koga, Tatsuki, et al.
Published: (2024)

PAPILLON: Privacy Preservation from Internet-based and Local Language Model Ensembles
by: Siyan, Li, et al.
Published: (2024)

Language-Native Materials Processing Design by Lightly Structured Text Database and Reasoning Large Language Model
by: Liu, Yuze, et al.
Published: (2025)

PRIV-QA: Privacy-Preserving Question Answering for Cloud Large Language Models
by: Li, Guangwei, et al.
Published: (2025)

Evaluating and Improving Cultural Awareness of Reward Models for LLM Alignment
by: Zhang, Hongbin, et al.
Published: (2025)

Privacy-Preserving Language Model Inference with Instance Obfuscation
by: Yao, Yixiang, et al.
Published: (2024)

Generative Interfaces for Language Models
by: Chen, Jiaqi, et al.
Published: (2025)

Jailbreaking Attacks vs. Content Safety Filters: How Far Are We in the LLM Safety Arms Race?
by: Xin, Yuan, et al.
Published: (2025)

Privacy Preserving In-Context-Learning Framework for Large Language Models
by: Bhusal, Bishnu, et al.
Published: (2025)

Understanding Layer Significance in LLM Alignment
by: Shi, Guangyuan, et al.
Published: (2024)

The Hidden Dimensions of LLM Alignment: A Multi-Dimensional Analysis of Orthogonal Safety Directions
by: Pan, Wenbo, et al.
Published: (2025)

Privacy-Preserving Instructions for Aligning Large Language Models
by: Yu, Da, et al.
Published: (2024)

Safety Is Not Universal: The Selective Safety Trap in LLM Alignment
by: Brito, Iago Alves, et al.
Published: (2026)

Emulated Disalignment: Safety Alignment for Large Language Models May Backfire!
by: Zhou, Zhanhui, et al.
Published: (2024)

How Alignment and Jailbreak Work: Explain LLM Safety through Intermediate Hidden States
by: Zhou, Zhenhong, et al.
Published: (2024)

PFID: Privacy First Inference Delegation Framework for LLMs
by: Yang, Haoyan, et al.
Published: (2024)

Semantics-Preserved Distortion for Personal Privacy Protection in Information Management
by: Li, Jiajia, et al.
Published: (2022)

Situated Natural Language Explanations
by: Zhu, Zining, et al.
Published: (2023)