Bibliographic Details
Main Author:	Rivasseau, Thomas
Format:	Preprint
Published:	2025
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2512.03001

_version_	1866914177731264512
author	Rivasseau, Thomas
author_facet	Rivasseau, Thomas
contents	Current research on operator control of Large Language Models improves model robustness against adversarial attacks and misbehavior by training on preference examples, prompting, and input/output filtering. Despite good results, LLMs remain susceptible to abuse, and jailbreak probability increases with context length. There is a need for robust LLM security guarantees in long-context situations. We propose control sentences inserted into the LLM context as invasive context engineering to partially solve the problem. We suggest this technique can be generalized to the Chain-of-Thought process to prevent scheming. Invasive Context Engineering does not rely on LLM training, avoiding data shortage pitfalls which arise in training models for long context situations.
format	Preprint
id	arxiv_https___arxiv_org_abs_2512_03001
institution	arXiv
publishDate	2025
record_format	arxiv
title	Invasive Context Engineering to Control Large Language Models
topic	Artificial Intelligence
url	https://arxiv.org/abs/2512.03001
