| Main Author: | Park, Sejik |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2403.19941 |
Similar Items
ResidualDroppath: Enhancing Feature Reuse over Residual Connections
by: Park, Sejik
Published: (2024)
Learning More Generalized Experts by Merging Experts in Mixture-of-Experts
by: Park, Sejik
Published: (2024)
Self-Normalized Resets for Plasticity in Continual Learning
by: Farias, Vivek F., et al.
Published: (2024)
Retro: Reusing teacher projection head for efficient embedding distillation on Lightweight Models via Self-supervised Learning
by: Nguyen, Khanh-Binh, et al.
Published: (2024)
Reset-free Reinforcement Learning with World Models
by: Yang, Zhao, et al.
Published: (2024)
The Power of Resets in Online Reinforcement Learning
by: Mhammedi, Zakaria, et al.
Published: (2024)
Single-Reset Divide & Conquer Imitation Learning
by: Chenu, Alexandre, et al.
Published: (2024)
Self-supervised network distillation: an effective approach to exploration in sparse reward environments
by: Pecháč, Matej, et al.
Published: (2023)
SPARE: Self-distillation for PARameter-Efficient Removal
by: Mola, Natnael, et al.
Published: (2026)
Smooth-Distill: A Self-distillation Framework for Multitask Learning with Wearable Sensor Data
by: Vu, Hoang-Dieu, et al.
Published: (2025)
Credit Assignment with Resets in Language Model Reasoning
by: Samanta, Ankur, et al.
Published: (2026)
Intelligent Switching for Reset-Free RL
by: Patil, Darshan, et al.
Published: (2024)
Non-Stationary Learning of Neural Networks with Automatic Soft Parameter Reset
by: Galashov, Alexandre, et al.
Published: (2024)
sDREAMER: Self-distilled Mixture-of-Modality-Experts Transformer for Automatic Sleep Staging
by: Chen, Jingyuan, et al.
Published: (2025)
Reset & Distill: A Recipe for Overcoming Negative Transfer in Continual Reinforcement Learning
by: Ahn, Hongjoon, et al.
Published: (2024)
Dataset Reset Policy Optimization for RLHF
by: Chang, Jonathan D., et al.
Published: (2024)
Towards Lightweight and Stable Zero-shot TTS with Self-distilled Representation Disentanglement
by: Chen, Qianniu, et al.
Published: (2025)
Education distillation:getting student models to learn in shcools
by: Feng, Ling, et al.
Published: (2023)
A Reinforcement Learning based Reset Policy for CDCL SAT Solvers
by: Li, Chunxiao, et al.
Published: (2024)
Metis-SPECS: Decoupling Multimodal Learning via Self-distilled Preference-based Cold Start
by: Chen, Kun, et al.
Published: (2025)
IDLM: Inverse-distilled Diffusion Language Models
by: Li, David, et al.
Published: (2026)
Self-Knowledge Distillation for Learning Ambiguity
by: Park, Hancheol, et al.
Published: (2024)
Spotlighting Partially Visible Cinematic Language for Video-to-Audio Generation via Self-distillation
by: Huang, Feizhen, et al.
Published: (2025)
OneSearch-V2: The Latent Reasoning Enhanced Self-distillation Generative Search Framework
by: Chen, Ben, et al.
Published: (2026)
Learning Diverse Policies with Soft Self-Generated Guidance
by: Wang, Guojian, et al.
Published: (2024)
On student-teacher deviations in distillation: does it pay to disobey?
by: Nagarajan, Vaishnavh, et al.
Published: (2023)
Knowledge distillation through geometry-aware representational alignment
by: Bhattarai, Prajjwal, et al.
Published: (2025)
Quality-Diversity Actor-Critic: Learning High-Performing and Diverse Behaviors via Value and Successor Features Critics
by: Grillotti, Luca, et al.
Published: (2024)
VendiRL: A Framework for Self-Supervised Reinforcement Learning of Diversely Diverse Skills
by: Lintunen, Erik M.
Published: (2025)
Towards a theory of model distillation
by: Boix-Adsera, Enric
Published: (2024)
VertiBench: Advancing Feature Distribution Diversity in Vertical Federated Learning Benchmarks
by: Wu, Zhaomin, et al.
Published: (2023)
DSLR: Diversity Enhancement and Structure Learning for Rehearsal-based Graph Continual Learning
by: Choi, Seungyoon, et al.
Published: (2024)
TabKD: Tabular Knowledge Distillation through Interaction Diversity of Learned Feature Bins
by: Pereira, Shovon Niverd, et al.
Published: (2026)
Never Reset Again: A Mathematical Framework for Continual Inference in Recurrent Neural Networks
by: Yin, Bojian, et al.
Published: (2024)
SPAM: Spike-Aware Adam with Momentum Reset for Stable LLM Training
by: Huang, Tianjin, et al.
Published: (2025)
Trust the uncertain teacher: distilling dark knowledge via calibrated uncertainty
by: Kim, Jeonghyun, et al.
Published: (2026)
ADMEDTAGGER: an annotation framework for distillation of expert knowledge for the Polish medical language
by: Górski, Franciszek, et al.
Published: (2025)
On Pretraining Data Diversity for Self-Supervised Learning
by: Hammoud, Hasan Abed Al Kader, et al.
Published: (2024)
Tabular Feature Discovery With Reasoning Type Exploration
by: Han, Sungwon, et al.
Published: (2025)
Stochastic Resetting Mitigates Latent Gradient Bias of SGD from Label Noise
by: Bae, Youngkyoung, et al.
Published: (2024)