Library Catalog

Bibliographic Details
Main Author:	Singh, Simardeep
Format:	Preprint
Published:	2025
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2506.16170

Similar Items

On the Memorization of Consistency Distillation for Diffusion Models
by: Jiang, Bingqing, et al.
Published: (2026)

Membership and Memorization in LLM Knowledge Distillation
by: Zhang, Ziqi, et al.
Published: (2025)

Toward Student-Oriented Teacher Network Training For Knowledge Distillation
by: Dong, Chengyu, et al.
Published: (2022)

Teaching the Teacher: The Role of Teacher-Student Smoothness Alignment in Genetic Programming-based Symbolic Distillation
by: Dhar, Soumyadeep, et al.
Published: (2025)

Generalizing Teacher Networks for Effective Knowledge Distillation Across Student Architectures
by: Binici, Kuluhan, et al.
Published: (2024)

Masking Teacher and Reinforcing Student for Distilling Vision-Language Models
by: Lee, Byung-Kwan, et al.
Published: (2025)

Model Merging via Multi-Teacher Knowledge Distillation
by: Dalili, Seyed Arshan, et al.
Published: (2025)

The Pitfalls of Memorization: When Memorization Hurts Generalization
by: Bayat, Reza, et al.
Published: (2024)

On Teacher Hacking in Language Model Distillation
by: Tiapkin, Daniil, et al.
Published: (2025)

Memorization Sinks: Isolating Memorization during LLM Training
by: Ghosal, Gaurav R., et al.
Published: (2025)

Geometry of Human Perceptual Domains Emerges Transiently in LLM Representations
by: Singh, Simardeep, et al.
Published: (2026)

Robust Knowledge Distillation Based on Feature Variance Against Backdoored Teacher Model
by: Chen, Jinyin, et al.
Published: (2024)

Mitigating Memorization In Language Models
by: Sakarvadia, Mansi, et al.
Published: (2024)

PACED: Distillation and On-Policy Self-Distillation at the Frontier of Student Competence
by: Xu, Yuanda, et al.
Published: (2026)

Multi-Teacher Knowledge Distillation via Teacher-Informed Mixture Priors
by: Fang, Luyang, et al.
Published: (2026)

Linear Projections of Teacher Embeddings for Few-Class Distillation
by: Loo, Noel, et al.
Published: (2024)

Generalizability of Memorization Neural Networks
by: Yu, Lijia, et al.
Published: (2024)

It's All Connected: A Journey Through Test-Time Memorization, Attentional Bias, Retention, and Online Optimization
by: Behrouz, Ali, et al.
Published: (2025)

Analyzing Memorization in Large Language Models through the Lens of Model Attribution
by: Menta, Tarun Ram, et al.
Published: (2025)

DUET: Distilled LLM Unlearning from an Efficiently Contextualized Teacher
by: Zhong, Yisheng, et al.
Published: (2026)

Beyond Frequency: The Role of Redundancy in Large Language Model Memorization
by: Zhang, Jie, et al.
Published: (2025)

Memorization Control in Diffusion Models from Denoising-centric Perspective
by: Vu, Thuy Phuong, et al.
Published: (2026)

Detecting Memorization in Large Language Models
by: Slonski, Eduardo
Published: (2024)

On Memorization in Diffusion Models
by: Gu, Xiangming, et al.
Published: (2023)

Teacher-Student Learning on Complexity in Intelligent Routing
by: Pi, Shu-Ting, et al.
Published: (2024)

Memorization in deep learning: A survey
by: Wei, Jiaheng, et al.
Published: (2024)

To Memorize or to Retrieve: Scaling Laws for RAG-Considerate Pretraining
by: Singh, Karan, et al.
Published: (2026)

Heuristic Methods are Good Teachers to Distill MLPs for Graph Link Prediction
by: Qin, Zongyue, et al.
Published: (2025)

Finding NeMo: Localizing Neurons Responsible For Memorization in Diffusion Models
by: Hintersdorf, Dominik, et al.
Published: (2024)

Understanding and Mitigating Memorization in Generative Models via Sharpness of Probability Landscapes
by: Jeon, Dongjae, et al.
Published: (2024)

YODA: Teacher-Student Progressive Learning for Language Models
by: Lu, Jianqiao, et al.
Published: (2024)

Do Students Debias Like Teachers? On the Distillability of Bias Mitigation Methods
by: Cheng, Jiali, et al.
Published: (2025)

Exploring Memorization in Fine-tuned Language Models
by: Zeng, Shenglai, et al.
Published: (2023)

Are Large Language Models Memorizing Bug Benchmarks?
by: Ramos, Daniel, et al.
Published: (2024)

Memorization in In-Context Learning
by: Golchin, Shahriar, et al.
Published: (2024)

Re-understanding Graph Unlearning through Memorization
by: Ding, Pengfei, et al.
Published: (2026)

Batch Normalization Amplifies Memorization and Privacy Risks
by: Doan, Ngoc Phu, et al.
Published: (2026)

Group Relative Knowledge Distillation: Learning from Teacher's Relational Inductive Bias
by: Li, Chao, et al.
Published: (2025)

When Are Teacher Tokens Reliable? Position-Weighted On-Policy Self-Distillation for Reasoning
by: Liu, Xiaogeng, et al.
Published: (2026)

CLIP-Embed-KD: Computationally Efficient Knowledge Distillation Using Embeddings as Teachers
by: Nair, Lakshmi
Published: (2024)