| Main Authors: | Kitkana, Chayanon; Arora, Shivam |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2604.25779 |
Similar Items
Subliminal Learning is a LoRA Artifact
by: Nief, Todd, et al.
Published: (2026)
Logit Distillation on Manifolds: Mapping by Learning
by: Yang, Yiru, et al.
Published: (2026)
Emergent and Subliminal Misalignment Through the Lens of Data-Mediated Transfer
by: Askin, Baris, et al.
Published: (2026)
SHRED: Retain-Set-Free Unlearning via Self-Distillation with Logit Demotion
by: Hu, Zizhao, et al.
Published: (2026)
Towards Understanding Subliminal Learning: When and How Hidden Biases Transfer
by: Schrodi, Simon, et al.
Published: (2025)
Logit Dynamics in Softmax Policy Gradient Methods
by: Li, Yingru
Published: (2025)
Reinforcement Learning via Auxiliary Task Distillation
by: Harish, Abhinav Narayan, et al.
Published: (2024)
Peak-Controlled Logits Poisoning Attack in Federated Distillation
by: Tang, Yuhan, et al.
Published: (2024)
Subliminal Learning: Language models transmit behavioral traits via hidden signals in data
by: Cloud, Alex, et al.
Published: (2025)
Harmonizing Multi-Objective LLM Unlearning via Unified Domain Representation and Bidirectional Logit Distillation
by: Zhong, Yisheng, et al.
Published: (2026)
Tactile MNIST: Benchmarking Active Tactile Perception
by: Schneider, Tim, et al.
Published: (2025)
Clarifying Model Transparency: Interpretability versus Explainability in Deep Learning with MNIST and IMDB Examples
by: Raj, Mitali
Published: (2025)
Uncovering Logit Suppression Vulnerabilities in LLM Safety Alignment
by: Li, Yuxi, et al.
Published: (2024)
Learning Through Noise: Why Subliminal Learning Works and When It Fails
by: Brockers, Vincent C., et al.
Published: (2026)
Multi-Step Alignment as Markov Games: An Optimistic Online Gradient Descent Approach with Convergence Guarantees
by: Wu, Yongtao, et al.
Published: (2025)
Efficient Knowledge Distillation via Curriculum Extraction
by: Gupta, Shivam, et al.
Published: (2025)
OGLS-SD: On-Policy Self-Distillation with Outcome-Guided Logit Steering for LLM Reasoning
by: Yang, Yuxiao, et al.
Published: (2026)
Subliminal Learning Is Steering Vector Distillation
by: Blank, Camila, et al.
Published: (2026)
Building Better Deception Probes Using Targeted Instruction Pairs
by: Natarajan, Vikram, et al.
Published: (2026)
Subliminal Effects in Your Data: A General Mechanism via Log-Linearity
by: Aden-Ali, Ishaq, et al.
Published: (2026)
Aligning Logits Generatively for Principled Black-Box Knowledge Distillation
by: Ma, Jing, et al.
Published: (2022)
Prioritize Alignment in Dataset Distillation
by: Li, Zekai, et al.
Published: (2024)
MNIST-Gen: A Modular MNIST-Style Dataset Generation Using Hierarchical Semantics, Reinforcement Learning, and Category Theory
by: Shaeri, Pouya, et al.
Published: (2025)
CAML: Collaborative Auxiliary Modality Learning for Multi-Agent Systems
by: Liu, Rui, et al.
Published: (2025)
SCALA: Split Federated Learning with Concatenated Activations and Logit Adjustments
by: Yang, Jiarong, et al.
Published: (2024)
Exploring the Performance of ML/DL Architectures on the MNIST-1D Dataset
by: Beebe, Michael, et al.
Published: (2026)
Robust and Resource-Efficient Data-Free Knowledge Distillation by Generative Pseudo Replay
by: Binici, Kuluhan, et al.
Published: (2022)
Logit Distance Bounds Representational Similarity
by: Nielsen, Beatrix M. G., et al.
Published: (2026)
LearnAlign: Data Selection for LLM Reinforcement Learning with Improved Gradient Alignment
by: Li, Shipeng, et al.
Published: (2025)
Multi-Task Reinforcement Learning with Language-Encoded Gated Policy Networks
by: Arora, Rushiv
Published: (2025)
Evaluating the Effectiveness of Data Augmentation for Emotion Classification in Low-Resource Settings
by: Arora, Aashish, et al.
Published: (2024)
Sparse Logit Sampling: Accelerating Knowledge Distillation in LLMs
by: Anshumann, et al.
Published: (2025)
Decoupled Split Learning via Auxiliary Loss
by: Zihad, Anower, et al.
Published: (2026)
A Hybrid Multi-Well Hopfield-CNN with Feature Extraction and K-Means for MNIST Classification
by: Farooq, Ahmed
Published: (2025)
Multi-Knowledge Fusion Network for Time Series Representation Learning
by: Sakhinana, Sagar Srinivas, et al.
Published: (2024)
Revisiting the Initial Steps in Adaptive Gradient Descent Optimization
by: Abuduweili, Abulikemu, et al.
Published: (2024)
Condensed Data Expansion Using Model Inversion for Knowledge Distillation
by: Binici, Kuluhan, et al.
Published: (2024)
Spectral Logit Sculpting: Adaptive Low-Rank Logit Transformation for Controlled Text Generation
by: Li, Jin, et al.
Published: (2025)
Personalized Subgraph Federated Learning with Differentiable Auxiliary Projections
by: Zhuo, Wei, et al.
Published: (2025)
Improving Continual Learning Performance and Efficiency with Auxiliary Classifiers
by: Szatkowski, Filip, et al.
Published: (2024)