| Main Authors: | Üyük, Cem, Lasby, Mike, Yassin, Mohamed, Evci, Utku, Ioannou, Yani |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2411.09816 |
Similar Items
Dynamic Sparse Training with Structured Sparsity
by: Lasby, Mike, et al.
Published: (2023)
Navigating Extremes: Dynamic Sparsity in Large Output Spaces
by: Ullah, Nasib, et al.
Published: (2024)
REAP the Experts: Why Pruning Prevails for One-Shot MoE compression
by: Lasby, Mike, et al.
Published: (2025)
SD$^2$: Self-Distilled Sparse Drafters
by: Lasby, Mike, et al.
Published: (2025)
The Journey Matters: Average Parameter Count over Pre-training Unifies Sparse and Dense Scaling Laws
by: Jin, Tian, et al.
Published: (2025)
What is Left After Distillation? How Knowledge Transfer Impacts Fairness and Bias
by: Mohammadshahi, Aida, et al.
Published: (2024)
Compression Scaling Laws: Unifying Sparsity and Quantization
by: Frantar, Elias, et al.
Published: (2025)
Sparse Training from Random Initialization: Aligning Lottery Ticket Masks using Weight Symmetry
by: Adnan, Mohammed, et al.
Published: (2025)
SparseOpt: Addressing Normalization-induced Gradient Skew in Sparse Training
by: Adnan, Mohammed, et al.
Published: (2026)
Towards Optimal Adapter Placement for Efficient Transfer Learning
by: Nowak, Aleksandra I., et al.
Published: (2024)
Meta-GCN: A Dynamically Weighted Loss Minimization Method for Dealing with the Data Imbalance in Graph Neural Networks
by: Mohammadizadeh, Mahdi, et al.
Published: (2024)
Self-Data Distillation for Recovering Quality in Pruned Large Language Models
by: Thangarasa, Vithursan, et al.
Published: (2024)
Progressive Gradient Flow for Robust N:M Sparsity Training in Transformers
by: Bambhaniya, Abhimanyu Rajeshkumar, et al.
Published: (2024)
Functional Bayesian Tucker Decomposition for Continuous-indexed Tensor Data
by: Fang, Shikai, et al.
Published: (2023)
The AL$\ell_0$CORE Tensor Decomposition for Sparse Count Data
by: Hood, John, et al.
Published: (2024)
Generalized Tensor-based Parameter-Efficient Fine-Tuning via Lie Group Transformations
by: Si, Chongjie, et al.
Published: (2025)
ReLATE: Learning Efficient Sparse Encoding for High-Performance Tensor Decomposition
by: Helal, Ahmed E., et al.
Published: (2025)
Distribution-Aware Tensor Decomposition for Compression of Convolutional Neural Networks
by: Kalle, Alper, et al.
Published: (2025)
Learning Without Augmenting: Unsupervised Time Series Representation Learning via Frame Projections
by: Demirel, Berken Utku, et al.
Published: (2025)
Multi-view Graph Condensation via Tensor Decomposition
by: Santos, Nícolas Roque dos, et al.
Published: (2025)
Robust Anomaly Detection via Tensor Pseudoskeleton Decomposition
by: Su, Bowen
Published: (2025)
No-Rank Tensor Decomposition Using Metric Learning
by: Bagherian, Maryam
Published: (2025)
Learning Symmetries via Weight-Sharing with Doubly Stochastic Tensors
by: van der Linden, Putri A., et al.
Published: (2024)
On Catastrophic Forgetting in Low-Rank Decomposition-Based Parameter-Efficient Fine-Tuning
by: Ahmad, Muhammad, et al.
Published: (2026)
XMoE: Sparse Models with Fine-grained and Adaptive Expert Selection
by: Yang, Yuanhang, et al.
Published: (2024)
Global and Local Structure Learning for Sparse Tensor Completion
by: Ahn, Dawon, et al.
Published: (2025)
Toward Temporal Causal Representation Learning with Tensor Decomposition
by: Chen, Jianhong, et al.
Published: (2025)
MEMO: Fine-grained Tensor Management For Ultra-long Context LLM Training
by: Zhao, Pinxue, et al.
Published: (2024)
Tensor Dynamic Mode Decomposition
by: He, Ziqin, et al.
Published: (2025)
Overcomplete Tensor Decomposition via Koszul-Young Flattenings
by: Kothari, Pravesh K., et al.
Published: (2024)
Graph-Based Spectral Decomposition for Parameter Coordination in Language Model Fine-Tuning
by: Zhang, Hanlu, et al.
Published: (2025)
Sparse Orthogonal Parameters Tuning for Continual Learning
by: Ning, Kun-Peng, et al.
Published: (2024)
Shifting the Paradigm: A Diffeomorphism Between Time Series Data Manifolds for Achieving Shift-Invariancy in Deep Learning
by: Demirel, Berken Utku, et al.
Published: (2025)
HiDe-PET: Continual Learning via Hierarchical Decomposition of Parameter-Efficient Tuning
by: Wang, Liyuan, et al.
Published: (2024)
Rethinking Parameter Sharing for LLM Fine-Tuning with Multiple LoRAs
by: Ban, Hao, et al.
Published: (2025)
Functional Complexity-adaptive Temporal Tensor Decomposition
by: Chen, Panqi, et al.
Published: (2025)
Tensor Convolutional Network for Higher-Order Interaction Prediction in Sparse Tensors
by: Jang, Jun-Gi, et al.
Published: (2025)
Fourier Low-rank and Sparse Tensor for Efficient Tensor Completion
by: Li, Jingyang, et al.
Published: (2025)
Computational and Statistical Guarantees for Tensor-on-Tensor Regression with Tensor Train Decomposition
by: Qin, Zhen, et al.
Published: (2024)
Examining Changes in Internal Representations of Continual Learning Models Through Tensor Decomposition
by: Aswani, Nishant Suresh, et al.
Published: (2024)