:: Library Catalog

Bibliographic Details
Main Authors:	Luo, Yifan; Zhou, Zhennan; Wang, Meitan; Dong, Bin
Format:	Preprint
Published:	2024
Subjects:	Computation and Language; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2410.10150

Similar Items

Adversarial Tuning: Defending Against Jailbreak Attacks for LLMs
by: Liu, Fan, et al.
Published: (2024)

Defending LLMs against Jailbreaking Attacks via Backtranslation
by: Wang, Yihan, et al.
Published: (2024)

Instruction Tuning With Loss Over Instructions
by: Shi, Zhengyan, et al.
Published: (2024)

Playing Language Game with LLMs Leads to Jailbreaking
by: Peng, Yu, et al.
Published: (2024)

From Language Modeling to Instruction Following: Understanding the Behavior Shift in LLMs after Instruction Tuning
by: Wu, Xuansheng, et al.
Published: (2023)

ClimateChat: Designing Data and Methods for Instruction Tuning LLMs to Answer Climate Change Queries
by: Chen, Zhou, et al.
Published: (2025)

Scaling Instruction-Tuned LLMs to Million-Token Contexts via Hierarchical Synthetic Data Generation
by: He, Linda, et al.
Published: (2025)

COLD-Attack: Jailbreaking LLMs with Stealthiness and Controllability
by: Guo, Xingang, et al.
Published: (2024)

SentenceVAE: Enable Next-sentence Prediction for Large Language Models with Faster Speed, Higher Accuracy and Longer Context
by: An, Hongjun, et al.
Published: (2024)

Span-level Emotion-Cause-Category Triplet Extraction with Instruction Tuning LLMs and Data Augmentation
by: Li, Xiangju, et al.
Published: (2025)

Phased Instruction Fine-Tuning for Large Language Models
by: Pang, Wei, et al.
Published: (2024)

SelectIT: Selective Instruction Tuning for LLMs via Uncertainty-Aware Self-Reflection
by: Liu, Liangxin, et al.
Published: (2024)

GemmAr: Enhancing LLMs Through Arabic Instruction-Tuning
by: Chouikhi, Hasna, et al.
Published: (2024)

How Reliable Are Automatic Evaluation Methods for Instruction-Tuned LLMs?
by: Doostmohammadi, Ehsan, et al.
Published: (2024)

ReAttn: Improving Attention-based Re-ranking via Attention Re-weighting
by: Tian, Yuxing, et al.
Published: (2026)

Jailbreaking LLMs via Calibration
by: Lu, Yuxuan, et al.
Published: (2026)

CrossIn: An Efficient Instruction Tuning Approach for Cross-Lingual Knowledge Alignment
by: Lin, Geyu, et al.
Published: (2024)

Jailbreak-Tuning: Models Efficiently Learn Jailbreak Susceptibility
by: Murphy, Brendan, et al.
Published: (2025)

InverseScope: Scalable Activation Inversion for Interpreting Large Language Models
by: Luo, Yifan, et al.
Published: (2025)

Poisoned LangChain: Jailbreak LLMs by LangChain
by: Wang, Ziqiu, et al.
Published: (2024)

From LLMs to MLLMs: Exploring the Landscape of Multimodal Jailbreaking
by: Wang, Siyuan, et al.
Published: (2024)

Pruning for Protection: Increasing Jailbreak Resistance in Aligned LLMs Without Fine-Tuning
by: Hasan, Adib, et al.
Published: (2024)

Teaching According to Talents! Instruction Tuning LLMs with Competence-Aware Curriculum Learning
by: Li, Yangning, et al.
Published: (2025)

Bridging Writing Manner Gap in Visual Instruction Tuning by Creating LLM-aligned Instructions
by: Jing, Dong, et al.
Published: (2025)

Building Accurate Translation-Tailored LLMs with Language Aware Instruction Tuning
by: Zan, Changtong, et al.
Published: (2024)

EasyJailbreak: A Unified Framework for Jailbreaking Large Language Models
by: Zhou, Weikang, et al.
Published: (2024)

Fight Back Against Jailbreaking via Prompt Adversarial Tuning
by: Mo, Yichuan, et al.
Published: (2024)

Jailbreaking to Jailbreak
by: Kritz, Jeremy, et al.
Published: (2025)

ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning
by: Chen, Mingyang, et al.
Published: (2025)

MIST: Jailbreaking Black-box Large Language Models via Iterative Semantic Tuning
by: Zheng, Muyang, et al.
Published: (2025)

Uncovering the Persuasive Fingerprint of LLMs in Jailbreaking Attacks
by: Noughabi, Havva Alizadeh, et al.
Published: (2025)

Efficient LLM-Jailbreaking via Multimodal-LLM Jailbreak
by: Ji, Haoxuan, et al.
Published: (2024)

Contrastive Instruction Tuning
by: Yan, Tianyi Lorena, et al.
Published: (2024)

A Comparative Analysis of Instruction Fine-Tuning LLMs for Financial Text Classification
by: Fatemi, Sorouralsadat, et al.
Published: (2024)

Teaching LLMs to Plan: Logical Chain-of-Thought Instruction Tuning for Symbolic Planning
by: Verma, Pulkit, et al.
Published: (2025)

MEXMA: Token-level objectives improve sentence representations
by: Janeiro, João Maria, et al.
Published: (2024)

Foot-In-The-Door: A Multi-turn Jailbreak for LLMs
by: Weng, Zixuan, et al.
Published: (2025)

Rapid Optimization for Jailbreaking LLMs via Subconscious Exploitation and Echopraxia
by: Shen, Guangyu, et al.
Published: (2024)

Tuning LLMs with Contrastive Alignment Instructions for Machine Translation in Unseen, Low-resource Languages
by: Mao, Zhuoyuan, et al.
Published: (2024)

Learning to Instruct for Visual Instruction Tuning
by: Zhou, Zhihan, et al.
Published: (2025)