| Main Authors: | Luo, Yifan, Zhou, Zhennan, Wang, Meitan, Dong, Bin |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2410.10150 |
Similar Items
Adversarial Tuning: Defending Against Jailbreak Attacks for LLMs
by: Liu, Fan, et al.
Published: (2024)
Defending LLMs against Jailbreaking Attacks via Backtranslation
by: Wang, Yihan, et al.
Published: (2024)
Instruction Tuning With Loss Over Instructions
by: Shi, Zhengyan, et al.
Published: (2024)
Playing Language Game with LLMs Leads to Jailbreaking
by: Peng, Yu, et al.
Published: (2024)
From Language Modeling to Instruction Following: Understanding the Behavior Shift in LLMs after Instruction Tuning
by: Wu, Xuansheng, et al.
Published: (2023)
ClimateChat: Designing Data and Methods for Instruction Tuning LLMs to Answer Climate Change Queries
by: Chen, Zhou, et al.
Published: (2025)
Scaling Instruction-Tuned LLMs to Million-Token Contexts via Hierarchical Synthetic Data Generation
by: He, Linda, et al.
Published: (2025)
COLD-Attack: Jailbreaking LLMs with Stealthiness and Controllability
by: Guo, Xingang, et al.
Published: (2024)
SentenceVAE: Enable Next-sentence Prediction for Large Language Models with Faster Speed, Higher Accuracy and Longer Context
by: An, Hongjun, et al.
Published: (2024)
Span-level Emotion-Cause-Category Triplet Extraction with Instruction Tuning LLMs and Data Augmentation
by: Li, Xiangju, et al.
Published: (2025)
Phased Instruction Fine-Tuning for Large Language Models
by: Pang, Wei, et al.
Published: (2024)
SelectIT: Selective Instruction Tuning for LLMs via Uncertainty-Aware Self-Reflection
by: Liu, Liangxin, et al.
Published: (2024)
GemmAr: Enhancing LLMs Through Arabic Instruction-Tuning
by: Chouikhi, Hasna, et al.
Published: (2024)
How Reliable Are Automatic Evaluation Methods for Instruction-Tuned LLMs?
by: Doostmohammadi, Ehsan, et al.
Published: (2024)
ReAttn: Improving Attention-based Re-ranking via Attention Re-weighting
by: Tian, Yuxing, et al.
Published: (2026)
Jailbreaking LLMs via Calibration
by: Lu, Yuxuan, et al.
Published: (2026)
CrossIn: An Efficient Instruction Tuning Approach for Cross-Lingual Knowledge Alignment
by: Lin, Geyu, et al.
Published: (2024)
Jailbreak-Tuning: Models Efficiently Learn Jailbreak Susceptibility
by: Murphy, Brendan, et al.
Published: (2025)
InverseScope: Scalable Activation Inversion for Interpreting Large Language Models
by: Luo, Yifan, et al.
Published: (2025)
Poisoned LangChain: Jailbreak LLMs by LangChain
by: Wang, Ziqiu, et al.
Published: (2024)
From LLMs to MLLMs: Exploring the Landscape of Multimodal Jailbreaking
by: Wang, Siyuan, et al.
Published: (2024)
Pruning for Protection: Increasing Jailbreak Resistance in Aligned LLMs Without Fine-Tuning
by: Hasan, Adib, et al.
Published: (2024)
Teaching According to Talents! Instruction Tuning LLMs with Competence-Aware Curriculum Learning
by: Li, Yangning, et al.
Published: (2025)
Bridging Writing Manner Gap in Visual Instruction Tuning by Creating LLM-aligned Instructions
by: Jing, Dong, et al.
Published: (2025)
Building Accurate Translation-Tailored LLMs with Language Aware Instruction Tuning
by: Zan, Changtong, et al.
Published: (2024)
EasyJailbreak: A Unified Framework for Jailbreaking Large Language Models
by: Zhou, Weikang, et al.
Published: (2024)
Fight Back Against Jailbreaking via Prompt Adversarial Tuning
by: Mo, Yichuan, et al.
Published: (2024)
Jailbreaking to Jailbreak
by: Kritz, Jeremy, et al.
Published: (2025)
ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning
by: Chen, Mingyang, et al.
Published: (2025)
MIST: Jailbreaking Black-box Large Language Models via Iterative Semantic Tuning
by: Zheng, Muyang, et al.
Published: (2025)
Uncovering the Persuasive Fingerprint of LLMs in Jailbreaking Attacks
by: Noughabi, Havva Alizadeh, et al.
Published: (2025)
Efficient LLM-Jailbreaking via Multimodal-LLM Jailbreak
by: Ji, Haoxuan, et al.
Published: (2024)
Contrastive Instruction Tuning
by: Yan, Tianyi Lorena, et al.
Published: (2024)
A Comparative Analysis of Instruction Fine-Tuning LLMs for Financial Text Classification
by: Fatemi, Sorouralsadat, et al.
Published: (2024)
Teaching LLMs to Plan: Logical Chain-of-Thought Instruction Tuning for Symbolic Planning
by: Verma, Pulkit, et al.
Published: (2025)
MEXMA: Token-level objectives improve sentence representations
by: Janeiro, João Maria, et al.
Published: (2024)
Foot-In-The-Door: A Multi-turn Jailbreak for LLMs
by: Weng, Zixuan, et al.
Published: (2025)
Rapid Optimization for Jailbreaking LLMs via Subconscious Exploitation and Echopraxia
by: Shen, Guangyu, et al.
Published: (2024)
Tuning LLMs with Contrastive Alignment Instructions for Machine Translation in Unseen, Low-resource Languages
by: Mao, Zhuoyuan, et al.
Published: (2024)
Learning to Instruct for Visual Instruction Tuning
by: Zhou, Zhihan, et al.
Published: (2025)