| Main Author: | Singh, Simardeep |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2506.16170 |
Similar Items
On the Memorization of Consistency Distillation for Diffusion Models
by: Jiang, Bingqing, et al.
Published: (2026)
Membership and Memorization in LLM Knowledge Distillation
by: Zhang, Ziqi, et al.
Published: (2025)
Toward Student-Oriented Teacher Network Training For Knowledge Distillation
by: Dong, Chengyu, et al.
Published: (2022)
Teaching the Teacher: The Role of Teacher-Student Smoothness Alignment in Genetic Programming-based Symbolic Distillation
by: Dhar, Soumyadeep, et al.
Published: (2025)
Generalizing Teacher Networks for Effective Knowledge Distillation Across Student Architectures
by: Binici, Kuluhan, et al.
Published: (2024)
Masking Teacher and Reinforcing Student for Distilling Vision-Language Models
by: Lee, Byung-Kwan, et al.
Published: (2025)
Model Merging via Multi-Teacher Knowledge Distillation
by: Dalili, Seyed Arshan, et al.
Published: (2025)
The Pitfalls of Memorization: When Memorization Hurts Generalization
by: Bayat, Reza, et al.
Published: (2024)
On Teacher Hacking in Language Model Distillation
by: Tiapkin, Daniil, et al.
Published: (2025)
Memorization Sinks: Isolating Memorization during LLM Training
by: Ghosal, Gaurav R., et al.
Published: (2025)
Geometry of Human Perceptual Domains Emerges Transiently in LLM Representations
by: Singh, Simardeep, et al.
Published: (2026)
Robust Knowledge Distillation Based on Feature Variance Against Backdoored Teacher Model
by: Chen, Jinyin, et al.
Published: (2024)
Mitigating Memorization In Language Models
by: Sakarvadia, Mansi, et al.
Published: (2024)
PACED: Distillation and On-Policy Self-Distillation at the Frontier of Student Competence
by: Xu, Yuanda, et al.
Published: (2026)
Multi-Teacher Knowledge Distillation via Teacher-Informed Mixture Priors
by: Fang, Luyang, et al.
Published: (2026)
Linear Projections of Teacher Embeddings for Few-Class Distillation
by: Loo, Noel, et al.
Published: (2024)
Generalizability of Memorization Neural Networks
by: Yu, Lijia, et al.
Published: (2024)
It's All Connected: A Journey Through Test-Time Memorization, Attentional Bias, Retention, and Online Optimization
by: Behrouz, Ali, et al.
Published: (2025)
Analyzing Memorization in Large Language Models through the Lens of Model Attribution
by: Menta, Tarun Ram, et al.
Published: (2025)
DUET: Distilled LLM Unlearning from an Efficiently Contextualized Teacher
by: Zhong, Yisheng, et al.
Published: (2026)
Beyond Frequency: The Role of Redundancy in Large Language Model Memorization
by: Zhang, Jie, et al.
Published: (2025)
Memorization Control in Diffusion Models from Denoising-centric Perspective
by: Vu, Thuy Phuong, et al.
Published: (2026)
Detecting Memorization in Large Language Models
by: Slonski, Eduardo
Published: (2024)
On Memorization in Diffusion Models
by: Gu, Xiangming, et al.
Published: (2023)
Teacher-Student Learning on Complexity in Intelligent Routing
by: Pi, Shu-Ting, et al.
Published: (2024)
Memorization in deep learning: A survey
by: Wei, Jiaheng, et al.
Published: (2024)
To Memorize or to Retrieve: Scaling Laws for RAG-Considerate Pretraining
by: Singh, Karan, et al.
Published: (2026)
Heuristic Methods are Good Teachers to Distill MLPs for Graph Link Prediction
by: Qin, Zongyue, et al.
Published: (2025)
Finding NeMo: Localizing Neurons Responsible For Memorization in Diffusion Models
by: Hintersdorf, Dominik, et al.
Published: (2024)
Understanding and Mitigating Memorization in Generative Models via Sharpness of Probability Landscapes
by: Jeon, Dongjae, et al.
Published: (2024)
YODA: Teacher-Student Progressive Learning for Language Models
by: Lu, Jianqiao, et al.
Published: (2024)
Do Students Debias Like Teachers? On the Distillability of Bias Mitigation Methods
by: Cheng, Jiali, et al.
Published: (2025)
Exploring Memorization in Fine-tuned Language Models
by: Zeng, Shenglai, et al.
Published: (2023)
Are Large Language Models Memorizing Bug Benchmarks?
by: Ramos, Daniel, et al.
Published: (2024)
Memorization in In-Context Learning
by: Golchin, Shahriar, et al.
Published: (2024)
Re-understanding Graph Unlearning through Memorization
by: Ding, Pengfei, et al.
Published: (2026)
Batch Normalization Amplifies Memorization and Privacy Risks
by: Doan, Ngoc Phu, et al.
Published: (2026)
Group Relative Knowledge Distillation: Learning from Teacher's Relational Inductive Bias
by: Li, Chao, et al.
Published: (2025)
When Are Teacher Tokens Reliable? Position-Weighted On-Policy Self-Distillation for Reasoning
by: Liu, Xiaogeng, et al.
Published: (2026)
CLIP-Embed-KD: Computationally Efficient Knowledge Distillation Using Embeddings as Teachers
by: Nair, Lakshmi
Published: (2024)