:: Library Catalog

Bibliographic Details
Main Authors:	Wyllie, Sierra; Shumailov, Ilia; Papernot, Nicolas
Format:	Preprint
Published:	2024
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2403.07857

Similar Items

Beyond Labeling Oracles: What does it mean to steal ML models?
by: Shafran, Avital, et al.
Published: (2023)

The Curse of Recursion: Training on Generated Data Makes Models Forget
by: Shumailov, Ilia, et al.
Published: (2023)

Inexact Unlearning Needs More Careful Evaluations to Avoid a False Sense of Privacy
by: Hayes, Jamie, et al.
Published: (2024)

Beyond Laplace and Gaussian: Exploring the Generalized Gaussian Mechanism for Private Machine Learning
by: Rinberg, Roy, et al.
Published: (2025)

When Vision Fails: Text Attacks Against ViT and OCR
by: Boucher, Nicholas, et al.
Published: (2023)

Gradients Look Alike: Sensitivity is Often Overestimated in DP-SGD
by: Thudi, Anvith, et al.
Published: (2023)

Architectural Neural Backdoors from First Principles
by: Langford, Harry, et al.
Published: (2024)

Backdoor Detection through Replicated Execution of Outsourced Training
by: Jia, Hengrui, et al.
Published: (2025)

Architectural Backdoors for Within-Batch Data Stealing and Model Inference Manipulation
by: Küchler, Nicolas, et al.
Published: (2025)

Buffer Overflow in Mixture of Experts
by: Hayes, Jamie, et al.
Published: (2024)

Cascading Adversarial Bias from Injection to Distillation in Language Models
by: Chaudhari, Harsh, et al.
Published: (2025)

SEA: Shareable and Explainable Attribution for Query-based Black-box Attacks
by: Gao, Yue, et al.
Published: (2023)

UnUnlearning: Unlearning is not sufficient for content regulation in advanced generative AI
by: Shumailov, Ilia, et al.
Published: (2024)

Machine Learning needs Better Randomness Standards: Randomised Smoothing and PRNG-based attacks
by: Dahiya, Pranav, et al.
Published: (2023)

Contextual Feedback Loops: Amplifying Deep Reasoning with Iterative Top-Down Feedback
by: Fein-Ashley, Jacob, et al.
Published: (2024)

Private Rate-Constrained Optimization with Applications to Fair Learning
by: Yaghini, Mohammad, et al.
Published: (2025)

What Does It Take to Build a Performant Selective Classifier?
by: Rabanser, Stephan, et al.
Published: (2025)

ceLLMate: Sandboxing Browser AI Agents
by: Meng, Luoxi, et al.
Published: (2025)

Quantamination: Dynamic Quantization Leaks Your Data Across the Batch
by: Foerster, Hanna, et al.
Published: (2026)

Beyond Slow Signs in High-fidelity Model Extraction
by: Foerster, Hanna, et al.
Published: (2024)

Measuring memorization in RLHF for code completion
by: Pappu, Aneesh, et al.
Published: (2024)

Watermarking Needs Input Repetition Masking
by: Khachaturov, David, et al.
Published: (2025)

Breach By A Thousand Leaks: Unsafe Information Leakage in 'Safe' AI Responses
by: Glukhov, David, et al.
Published: (2024)

Stealing User Prompts from Mixture of Experts
by: Yona, Itay, et al.
Published: (2024)

Hardware and Software Platform Inference
by: Zhang, Cheng, et al.
Published: (2024)

Revisiting Block-based Quantisation: What is Important for Sub-8-bit LLM Inference?
by: Zhang, Cheng, et al.
Published: (2023)

ImpNet: Imperceptible and blackbox-undetectable backdoors in compiled neural networks
by: Clifford, Eleanor, et al.
Published: (2022)

Tighter Privacy Auditing of DP-SGD in the Hidden State Threat Model
by: Cebere, Tudor, et al.
Published: (2024)

SIPDO: Closed-Loop Prompt Optimization via Synthetic Data Feedback
by: Yu, Yaoning, et al.
Published: (2025)

Large Language Models Can Verbatim Reproduce Long Malicious Sequences
by: Lin, Sharon, et al.
Published: (2025)

Soft Instruction De-escalation Defense
by: Walter, Nils Philipp, et al.
Published: (2025)

Interpreting the Repeated Token Phenomenon in Large Language Models
by: Yona, Itay, et al.
Published: (2025)

Machine Learning Models Have a Supply Chain Problem
by: Meiklejohn, Sarah, et al.
Published: (2025)

Fast Exact Unlearning for In-Context Learning Data for LLMs
by: Muresanu, Andrei I., et al.
Published: (2024)

Selective Prediction via Training Dynamics
by: Rabanser, Stephan, et al.
Published: (2022)

MCRAGE: Synthetic Healthcare Data for Fairness
by: Behal, Keira, et al.
Published: (2023)

Adversarial Bias: Data Poisoning Attacks on Fairness
by: Chan, Eunice, et al.
Published: (2025)

FairFinGAN: Fairness-aware Synthetic Financial Data Generation
by: Quy, Tai Le, et al.
Published: (2026)

LLM Dataset Inference: Did you train on my dataset?
by: Maini, Pratyush, et al.
Published: (2024)

Regulation Games for Trustworthy Machine Learning
by: Yaghini, Mohammad, et al.
Published: (2024)