APA citation style (7th ed.)

Ravindran, S. K. (2025). Adversarial Activation Patching: A Framework for Detecting and Mitigating Emergent Deception in Safety-Aligned Transformers.

Chicago citation style (17th ed.)

Ravindran, Santhosh Kumar. Adversarial Activation Patching: A Framework for Detecting and Mitigating Emergent Deception in Safety-Aligned Transformers. 2025.

MLA citation style (9th ed.)

Ravindran, Santhosh Kumar. Adversarial Activation Patching: A Framework for Detecting and Mitigating Emergent Deception in Safety-Aligned Transformers. 2025.

Caution: These citations may not be 100% accurate.