Library Catalog

Bibliographic Details
Main Authors:	Huang, Pengrun; Chaudhuri, Kamalika; Wang, Yu-Xiang
Format:	Preprint
Published:	2026
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2605.06865

Similar Items

Better Membership Inference Privacy Measurement through Discrepancy
by: Wu, Ruihan, et al.
Published: (2024)

Can We Infer Confidential Properties of Training Data from LLMs?
by: Huang, Pengrun, et al.
Published: (2025)

DPrivBench: Benchmarking LLMs' Reasoning for Differential Privacy
by: Wang, Erchi, et al.
Published: (2026)

A Closer Look at the Learnability of Out-of-Distribution (OOD) Detection
by: Garov, Konstantin, et al.
Published: (2025)

Learning-Time Encoding Shapes Unlearning in LLMs
by: Wu, Ruihan, et al.
Published: (2025)

Distribution Learning with Valid Outputs Beyond the Worst-Case
by: Rittler, Nick, et al.
Published: (2024)

Unified Uncertainty Calibration
by: Chaudhuri, Kamalika, et al.
Published: (2023)

Data Redaction from Conditional Generative Models
by: Kong, Zhifeng, et al.
Published: (2023)

Beyond Discrepancy: A Closer Look at the Theory of Distribution Shift
by: Bhattacharjee, Robi, et al.
Published: (2024)

Z0-Inf: Zeroth Order Approximation for Data Influence
by: Kokhlikyan, Narine, et al.
Published: (2025)

Auditing $f$-Differential Privacy in One Run
by: Mahloujifar, Saeed, et al.
Published: (2024)

Influence-based Attributions can be Manipulated
by: Yadav, Chhavi, et al.
Published: (2024)

Déjà Vu Memorization in Vision-Language Models
by: Jayaraman, Bargav, et al.
Published: (2024)

Provable Watermarking for Data Poisoning Attacks
by: Zhu, Yifan, et al.
Published: (2025)

Privacy Blur: Quantifying Privacy and Utility for Image Data Release
by: Mahloujifar, Saeed, et al.
Published: (2025)

Not-a-Bandit: Provably No-Regret Drafter Selection in Speculative Decoding for LLMs
by: Liu, Hongyi, et al.
Published: (2025)

Ward: Provable RAG Dataset Inference via LLM Watermarks
by: Jovanović, Nikola, et al.
Published: (2024)

Machine Learning with Privacy for Protected Attributes
by: Mahloujifar, Saeed, et al.
Published: (2025)

RL Is a Hammer and LLMs Are Nails: A Simple Reinforcement Learning Recipe for Strong Prompt Injection
by: Wen, Yuxin, et al.
Published: (2025)

FairProof : Confidential and Certifiable Fairness for Neural Networks
by: Yadav, Chhavi, et al.
Published: (2024)

On Differentially Private U Statistics
by: Chaudhuri, Kamalika, et al.
Published: (2024)

ExpProof : Operationalizing Explanations for Confidential Models with ZKPs
by: Yadav, Chhavi, et al.
Published: (2025)

Privacy Amplification for the Gaussian Mechanism via Bounded Support
by: Hu, Shengyuan, et al.
Published: (2024)

Implicit Identity Technologies for LLMs: Fingerprinting and Watermarking across Datasets, Models, and Generated Content
by: Liu, Bing, et al.
Published: (2026)

Measuring Déjà vu Memorization Efficiently
by: Kokhlikyan, Narine, et al.
Published: (2025)

SecAlign: Defending Against Prompt Injection with Preference Optimization
by: Chen, Sizhe, et al.
Published: (2024)

Closing the Loop: A Control-Theoretic Framework for Provably Stable Time Series Forecasting with LLMs
by: Zhang, Xingyu, et al.
Published: (2026)

Differentially Private Representation Learning via Image Captioning
by: Sander, Tom, et al.
Published: (2024)

Guarantees of confidentiality via Hammersley-Chapman-Robbins bounds
by: Chaudhuri, Kamalika, et al.
Published: (2024)

On the Empirical Power of Goodness-of-Fit Tests in Watermark Detection
by: He, Weiqing, et al.
Published: (2025)

Gradient-guided Attention Map Editing: Towards Efficient Contextual Hallucination Mitigation
by: Wang, Yu, et al.
Published: (2025)

Detecting Post-generation Edits to Watermarked LLM Outputs via Combinatorial Watermarking
by: Xie, Liyan, et al.
Published: (2025)

The Behavior Gap: Evaluating Zero-shot LLM Agents in Complex Task-Oriented Dialogs
by: Baidya, Avinash, et al.
Published: (2025)

Towards Watermarking of Open-Source LLMs
by: Gloaguen, Thibaud, et al.
Published: (2025)

Do LLMs Really Forget? Evaluating Unlearning with Knowledge Correlation and Confidence Awareness
by: Wei, Rongzhe, et al.
Published: (2025)

WaterMAS: Sharpness-Aware Maximization for Neural Network Watermarking
by: Trias, Carl De Sousa, et al.
Published: (2024)

Provable Risk-Sensitive Distributional Reinforcement Learning with General Function Approximation
by: Chen, Yu, et al.
Published: (2024)

Towards Better Statistical Understanding of Watermarking LLMs
by: Cai, Zhongze, et al.
Published: (2024)

Discriminant Distance-Aware Representation on Deterministic Uncertainty Quantification Methods
by: Zhang, Jiaxin, et al.
Published: (2024)

BiMarker: Enhancing Text Watermark Detection for Large Language Models with Bipolar Watermarks
by: Li, Zhuang, et al.
Published: (2025)