| Main Authors: | Wyllie, Sierra; Shumailov, Ilia; Papernot, Nicolas |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2403.07857 |
Similar Items
Beyond Labeling Oracles: What does it mean to steal ML models?
by: Shafran, Avital, et al.
Published: (2023)
The Curse of Recursion: Training on Generated Data Makes Models Forget
by: Shumailov, Ilia, et al.
Published: (2023)
Inexact Unlearning Needs More Careful Evaluations to Avoid a False Sense of Privacy
by: Hayes, Jamie, et al.
Published: (2024)
Beyond Laplace and Gaussian: Exploring the Generalized Gaussian Mechanism for Private Machine Learning
by: Rinberg, Roy, et al.
Published: (2025)
When Vision Fails: Text Attacks Against ViT and OCR
by: Boucher, Nicholas, et al.
Published: (2023)
Gradients Look Alike: Sensitivity is Often Overestimated in DP-SGD
by: Thudi, Anvith, et al.
Published: (2023)
Architectural Neural Backdoors from First Principles
by: Langford, Harry, et al.
Published: (2024)
Backdoor Detection through Replicated Execution of Outsourced Training
by: Jia, Hengrui, et al.
Published: (2025)
Architectural Backdoors for Within-Batch Data Stealing and Model Inference Manipulation
by: Küchler, Nicolas, et al.
Published: (2025)
Buffer Overflow in Mixture of Experts
by: Hayes, Jamie, et al.
Published: (2024)
Cascading Adversarial Bias from Injection to Distillation in Language Models
by: Chaudhari, Harsh, et al.
Published: (2025)
SEA: Shareable and Explainable Attribution for Query-based Black-box Attacks
by: Gao, Yue, et al.
Published: (2023)
UnUnlearning: Unlearning is not sufficient for content regulation in advanced generative AI
by: Shumailov, Ilia, et al.
Published: (2024)
Machine Learning needs Better Randomness Standards: Randomised Smoothing and PRNG-based attacks
by: Dahiya, Pranav, et al.
Published: (2023)
Contextual Feedback Loops: Amplifying Deep Reasoning with Iterative Top-Down Feedback
by: Fein-Ashley, Jacob, et al.
Published: (2024)
Private Rate-Constrained Optimization with Applications to Fair Learning
by: Yaghini, Mohammad, et al.
Published: (2025)
What Does It Take to Build a Performant Selective Classifier?
by: Rabanser, Stephan, et al.
Published: (2025)
ceLLMate: Sandboxing Browser AI Agents
by: Meng, Luoxi, et al.
Published: (2025)
Quantamination: Dynamic Quantization Leaks Your Data Across the Batch
by: Foerster, Hanna, et al.
Published: (2026)
Beyond Slow Signs in High-fidelity Model Extraction
by: Foerster, Hanna, et al.
Published: (2024)
Measuring memorization in RLHF for code completion
by: Pappu, Aneesh, et al.
Published: (2024)
Watermarking Needs Input Repetition Masking
by: Khachaturov, David, et al.
Published: (2025)
Breach By A Thousand Leaks: Unsafe Information Leakage in 'Safe' AI Responses
by: Glukhov, David, et al.
Published: (2024)
Stealing User Prompts from Mixture of Experts
by: Yona, Itay, et al.
Published: (2024)
Hardware and Software Platform Inference
by: Zhang, Cheng, et al.
Published: (2024)
Revisiting Block-based Quantisation: What is Important for Sub-8-bit LLM Inference?
by: Zhang, Cheng, et al.
Published: (2023)
ImpNet: Imperceptible and blackbox-undetectable backdoors in compiled neural networks
by: Clifford, Eleanor, et al.
Published: (2022)
Tighter Privacy Auditing of DP-SGD in the Hidden State Threat Model
by: Cebere, Tudor, et al.
Published: (2024)
SIPDO: Closed-Loop Prompt Optimization via Synthetic Data Feedback
by: Yu, Yaoning, et al.
Published: (2025)
Large Language Models Can Verbatim Reproduce Long Malicious Sequences
by: Lin, Sharon, et al.
Published: (2025)
Soft Instruction De-escalation Defense
by: Walter, Nils Philipp, et al.
Published: (2025)
Interpreting the Repeated Token Phenomenon in Large Language Models
by: Yona, Itay, et al.
Published: (2025)
Machine Learning Models Have a Supply Chain Problem
by: Meiklejohn, Sarah, et al.
Published: (2025)
Fast Exact Unlearning for In-Context Learning Data for LLMs
by: Muresanu, Andrei I., et al.
Published: (2024)
Selective Prediction via Training Dynamics
by: Rabanser, Stephan, et al.
Published: (2022)
MCRAGE: Synthetic Healthcare Data for Fairness
by: Behal, Keira, et al.
Published: (2023)
Adversarial Bias: Data Poisoning Attacks on Fairness
by: Chan, Eunice, et al.
Published: (2025)
FairFinGAN: Fairness-aware Synthetic Financial Data Generation
by: Quy, Tai Le, et al.
Published: (2026)
LLM Dataset Inference: Did you train on my dataset?
by: Maini, Pratyush, et al.
Published: (2024)
Regulation Games for Trustworthy Machine Learning
by: Yaghini, Mohammad, et al.
Published: (2024)