Saved in:
Bibliographic Details
Main Author: Rivasseau, Thomas
Format: Preprint
Published: 2025
Subjects:
Online Access:https://arxiv.org/abs/2512.03001
Tags: Add Tag
No Tags, Be the first to tag this record!
Table of Contents:
  • Current research on operator control of Large Language Models improves model robustness against adversarial attacks and misbehavior by training on preference examples, prompting, and input/output filtering. Despite good results, LLMs remain susceptible to abuse, and jailbreak probability increases with context length. There is a need for robust LLM security guarantees in long-context situations. We propose control sentences inserted into the LLM context as invasive context engineering to partially solve the problem. We suggest this technique can be generalized to the Chain-of-Thought process to prevent scheming. Invasive Context Engineering does not rely on LLM training, avoiding data shortage pitfalls which arise in training models for long context situations.