| Main Authors: | Conzelmann, Alexander, Catalan-Tatjer, Albert, Liu, Shiwei |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2605.06366 |
Similar Items
Training Dynamics Impact Post-Training Quantization Robustness
by: Catalan-Tatjer, Albert, et al.
Published: (2025)
Reducing Storage of Pretrained Neural Networks by Rate-Constrained Quantization and Entropy Coding
by: Conzelmann, Alexander, et al.
Published: (2025)
Decentralized Task Offloading and Load-Balancing for Mobile Edge Computing in Dense Networks
by: Yahya, Mariam, et al.
Published: (2024)
AlphaPruning: Using Heavy-Tailed Self Regularization Theory for Improved Layer-wise Pruning of Large Language Models
by: Lu, Haiquan, et al.
Published: (2024)
Quantifying Error Propagation and Model Collapse in Diffusion Models
by: Khelifa, Nail B., et al.
Published: (2026)
ActTail: Global Activation Sparsity in Large Language Models
by: Hou, Wenwen, et al.
Published: (2026)
Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN
by: Li, Pengxiang, et al.
Published: (2024)
dgMARK: Decoding-Guided Watermarking for Diffusion Language Models
by: Hong, Pyo Min, et al.
Published: (2026)
Path-Dependent Denoising: A Non-Conservative Field Perspective on Order Collapse in Diffusion Language Models
by: Kim, Jeonseong
Published: (2026)
On the Collapse Errors Induced by the Deterministic Sampler for Diffusion Models
by: Zhang, Yi, et al.
Published: (2025)
Dominating vs. Dominated: Generative Collapse in Diffusion Models
by: Jeong, Hayeon, et al.
Published: (2025)
Layer Collapse Can be Induced by Unstructured Pruning
by: Liao, Zhu, et al.
Published: (2024)
LayerCollapse: Adaptive compression of neural networks
by: Shabgahi, Soheil Zibakhsh, et al.
Published: (2023)
LaCoOT: Layer Collapse through Optimal Transport
by: Quétu, Victor, et al.
Published: (2024)
When Fewer Layers Break More Chains: Layer Pruning Harms Test-Time Scaling in LLMs
by: Wang, Keyu, et al.
Published: (2025)
The Curse of Depth in Large Language Models
by: Sun, Wenfang, et al.
Published: (2025)
Strong Model Collapse
by: Dohmatob, Elvis, et al.
Published: (2024)
Collapse-Free Prototype Readout Layer for Transformer Encoders
by: Cirrincione, Giansalvo, et al.
Published: (2026)
Provably Learning Diffusion Models under the Manifold Hypothesis: Collapse and Refine
by: Huang, Wei, et al.
Published: (2026)
Language Generation with Replay: A Learning-Theoretic View of Model Collapse
by: Racca, Giorgio, et al.
Published: (2026)
Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients
by: Zhang, Zhenyu, et al.
Published: (2024)
Diffusion Language Models for Speech Recognition
by: Naveriani, Davyd, et al.
Published: (2026)
Model Unmerging: Making Your Models Unmergeable for Secure Model Sharing
by: Wang, Zihao, et al.
Published: (2025)
Monitoring Neural Training with Topology: A Footprint-Predictable Collapse Index
by: Kalinowski, Alexander
Published: (2026)
When One Modality Rules Them All: Backdoor Modality Collapse in Multimodal Diffusion Models
by: Wang, Qitong, et al.
Published: (2026)
Variational Language Concepts for Interpreting Foundation Language Models
by: Wang, Hengyi, et al.
Published: (2024)
Mind the Gap: a Spectral Analysis of Rank Collapse and Signal Propagation in Attention Layers
by: Saada, Thiziri Nait, et al.
Published: (2024)
When Attention Collapses: How Degenerate Layers in LLMs Enable Smaller, Stronger Models
by: Sanyal, Sunny, et al.
Published: (2024)
Multi-modal Synthetic Data Training and Model Collapse: Insights from VLMs and Diffusion Models
by: Hu, Zizhao, et al.
Published: (2025)
DiffusionOPD: A Unified Perspective of On-Policy Distillation in Diffusion Models
by: Li, Quanhao, et al.
Published: (2026)
Scaling Embedding Layers in Language Models
by: Yu, Da, et al.
Published: (2025)
LOST: Low-rank and Sparse Pre-training for Large Language Models
by: Li, Jiaxi, et al.
Published: (2025)
A Probabilistic Perspective on Model Collapse
by: Xu, Shirong, et al.
Published: (2025)
On the Robustness of Neural Collapse and the Neural Collapse of Robustness
by: Su, Jingtong, et al.
Published: (2023)
Reward Collapse in Aligning Large Language Models
by: Song, Ziang, et al.
Published: (2023)
How to Unlock Time Series Editing? Diffusion-Driven Approach with Multi-Grained Control
by: Yu, Hao, et al.
Published: (2025)
Addressing Representation Collapse in Vector Quantized Models with One Linear Layer
by: Zhu, Yongxin, et al.
Published: (2024)
Simple Drop-in LoRA Conditioning on Attention Layers Will Improve Your Diffusion Model
by: Choi, Joo Young, et al.
Published: (2024)
It's Never Too Late: Noise Optimization for Collapse Recovery in Trained Diffusion Models
by: Harrington, Anne, et al.
Published: (2025)
L$^3$: Large Lookup Layers
by: Tseng, Albert, et al.
Published: (2026)