| Main Authors: | Alballa, Norah, Abdelmoniem, Ahmed M., Canini, Marco |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2402.14922 |
Similar Items
Query-based Knowledge Transfer for Heterogeneous Learning Environments
by: Alballa, Norah, et al.
Published: (2025)
Flashback: Understanding and Mitigating Forgetting in Federated Learning
by: Aljahdali, Mohammed, et al.
Published: (2024)
DeepFusion: Accelerating MoE Training via Federated Knowledge Distillation from Heterogeneous Edge Devices
by: Li, Songyuan, et al.
Published: (2026)
A Meta-learning based Stacked Regression Approach for Customer Lifetime Value Prediction
by: Gadgil, Karan, et al.
Published: (2023)
Stock Market Price Prediction: A Hybrid LSTM and Sequential Self-Attention based Approach
by: Pardeshi, Karan, et al.
Published: (2023)
Federated Knowledge Transfer Fine-tuning Large Server Model with Resource-Constrained IoT Clients
by: Chen, Shaoyuan, et al.
Published: (2024)
On the Surprising Efficacy of Distillation as an Alternative to Pre-Training Small Models
by: Farhat, Sean, et al.
Published: (2024)
RAP: KV-Cache Compression via RoPE-Aligned Pruning
by: Xin, Jihao, et al.
Published: (2026)
Pre-Training Graph Contrastive Masked Autoencoders are Strong Distillers for EEG
by: Wei, Xinxu, et al.
Published: (2024)
ACING: Actor-Critic for Instruction Learning in Black-Box LLMs
by: Kharrat, Salma, et al.
Published: (2024)
Toward Student-Oriented Teacher Network Training For Knowledge Distillation
by: Dong, Chengyu, et al.
Published: (2022)
Dataset Distillation for Pre-Trained Self-Supervised Vision Models
by: Cazenavette, George, et al.
Published: (2025)
Training Plug-n-Play Knowledge Modules with Deep Context Distillation
by: Caccia, Lucas, et al.
Published: (2025)
Panther: Faster and Cheaper Computations with Randomized Numerical Linear Algebra
by: Seddik, Fahd, et al.
Published: (2026)
Decentralized Personalized Federated Learning
by: Kharrat, Salma, et al.
Published: (2024)
When Fewer Layers Break More Chains: Layer Pruning Harms Test-Time Scaling in LLMs
by: Wang, Keyu, et al.
Published: (2025)
Training Language Models on the Knowledge Graph: Insights on Hallucinations and Their Detectability
by: Hron, Jiri, et al.
Published: (2024)
FLStore: Efficient Federated Learning Storage for non-training workloads
by: Khan, Ahmad Faraz, et al.
Published: (2025)
Training an LLM-as-a-Judge Model: Pipeline, Insights, and Practical Lessons
by: Hu, Renjun, et al.
Published: (2025)
Training Domain Draft Models for Speculative Decoding: Best Practices and Insights
by: Hong, Fenglu, et al.
Published: (2025)
Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models
by: Bick, Aviv, et al.
Published: (2024)
Model Merging via Multi-Teacher Knowledge Distillation
by: Dalili, Seyed Arshan, et al.
Published: (2025)
A Communication and Computation Efficient Fully First-order Method for Decentralized Bilevel Optimization
by: Wen, Min, et al.
Published: (2024)
Multi-Stage Knowledge-Distilled VGAE and GAT for Robust Controller-Area-Network Intrusion Detection
by: Frenken, Robert, et al.
Published: (2025)
FilFL: Client Filtering for Optimized Client Participation in Federated Learning
by: Fourati, Fares, et al.
Published: (2023)
A Time Series Multitask Framework Integrating a Large Language Model, Pre-Trained Time Series Model, and Knowledge Graph
by: Hao, Shule, et al.
Published: (2025)
A Survey on Time-Series Pre-Trained Models
by: Ma, Qianli, et al.
Published: (2023)
Condensed Data Expansion Using Model Inversion for Knowledge Distillation
by: Binici, Kuluhan, et al.
Published: (2024)
Graph Knowledge Distillation to Mixture of Experts
by: Rumiantsev, Pavel, et al.
Published: (2024)
Dynamic Temperature Scheduler for Knowledge Distillation
by: Islam, Sibgat Ul, et al.
Published: (2025)
Membership and Memorization in LLM Knowledge Distillation
by: Zhang, Ziqi, et al.
Published: (2025)
Training Heterogeneous Client Models using Knowledge Distillation in Serverless Federated Learning
by: Chadha, Mohak, et al.
Published: (2024)
Topology Only Pre-Training: Towards Generalised Multi-Domain Graph Models
by: Davies, Alex O., et al.
Published: (2023)
Task-Agnostic Federation over Decentralized Data: Research Landscape and Visions
by: Wu, Wentai, et al.
Published: (2025)
KD-GAT: Combining Knowledge Distillation and Graph Attention Transformer for a Controller Area Network Intrusion Detection System
by: Frenken, Robert, et al.
Published: (2025)
Split Knowledge Distillation for Large Models in IoT: Architecture, Challenges, and Solutions
by: Li, Zuguang, et al.
Published: (2024)
Model Mimic Attack: Knowledge Distillation for Provably Transferable Adversarial Examples
by: Lukyanov, Kirill, et al.
Published: (2024)
Enhancing Knowledge Graph Completion with GNN Distillation and Probabilistic Interaction Modeling
by: Wang, Lingzhi, et al.
Published: (2025)
Ensemble of Pre-Trained Models for Long-Tailed Trajectory Prediction
by: Thuremella, Divya, et al.
Published: (2025)
Delta Knowledge Distillation for Large Language Models
by: Cao, Yihan, et al.
Published: (2025)