| Main Authors: | Deng, Zhuoran, Zhang, Yizhi, Zhang, Ziyi, Shen, Wan |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2603.09476 |
Similar Items
Training Multimodal Large Reasoning Models Needs Better Thoughts: A Three-Stage Framework for Long Chain-of-Thought Synthesis and Selection
by: Wang, Yizhi, et al.
Published: (2025)
Mamba or Transformer for Time Series Forecasting? Mixture of Universals (MoU) Is All You Need
by: Peng, Sijia, et al.
Published: (2024)
Synthetic Data RL: Task Definition Is All You Need
by: Guo, Yiduo, et al.
Published: (2025)
What Matters in Transformers? Not All Attention is Needed
by: He, Shwai, et al.
Published: (2024)
Language is All a Graph Needs
by: Ye, Ruosong, et al.
Published: (2023)
More Agents Is All You Need
by: Li, Junyou, et al.
Published: (2024)
Ideal Registration? Segmentation is All You Need
by: Chen, Xiang, et al.
Published: (2025)
A Single Goal is All You Need: Skills and Exploration Emerge from Contrastive RL without Rewards, Demonstrations, or Subgoals
by: Liu, Grace, et al.
Published: (2024)
Not All Documents Are What You Need for Extracting Instruction Tuning Data
by: Zhang, Chi, et al.
Published: (2025)
Common Sense Is All You Need
by: Latapie, Hugo
Published: (2025)
Attention Is All You Need for KV Cache in Diffusion LLMs
by: Nguyen-Tri, Quan, et al.
Published: (2025)
Not All Layers Need Tuning: Selective Layer Restoration Recovers Diversity
by: Zhang, Bowen, et al.
Published: (2026)
Rho-1: Not All Tokens Are What You Need
by: Lin, Zhenghao, et al.
Published: (2024)
Context is All You Need
by: Delanois, Jean Erik, et al.
Published: (2026)
Contrast Is All You Need
by: Kilic, Burak, et al.
Published: (2023)
Bridging Efficiency and Transparency: Explainable CoT Compression in Multimodal Large Reasoning Models
by: Wang, Yizhi, et al.
Published: (2026)
The Residual Stream Is All You Need: On the Redundancy of the KV Cache in Transformer Inference
by: Qasim, Kaleem Ullah, et al.
Published: (2026)
Goal-Conditioned Agents that Learn Everything All at Once
by: Matthews, Michael, et al.
Published: (2026)
Are Tools All We Need? Unveiling the Tool-Use Tax in LLM Agents
by: Zhang, Kaituo, et al.
Published: (2026)
Information Gain Is Not All You Need
by: Ericson, Ludvig, et al.
Published: (2025)
Memory augment is All You Need for image restoration
by: Zhang, Xiao Feng, et al.
Published: (2023)
Avoiding Premature Collapse: Adaptive Annealing for Entropy-Regularized Structural Inference
by: Liu, Yizhi
Published: (2026)
The Homogeneity Trap: Spectral Collapse in Doubly-Stochastic Deep Networks
by: Liu, Yizhi
Published: (2026)
Tensor Product Attention Is All You Need
by: Zhang, Yifan, et al.
Published: (2025)
Similarity is Not All You Need: Endowing Retrieval Augmented Generation with Multi Layered Thoughts
by: Gan, Chunjing, et al.
Published: (2024)
Attention is All You Need Until You Need Retention
by: Yaslioglu, M. Murat
Published: (2025)
TransMLA: Multi-Head Latent Attention Is All You Need
by: Meng, Fanxu, et al.
Published: (2025)
Exploitation Is All You Need... for Exploration
by: Rentschler, Micah, et al.
Published: (2025)
Were RNNs All We Needed?
by: Feng, Leo, et al.
Published: (2024)
Training on the Benchmark Is Not All You Need
by: Ni, Shiwen, et al.
Published: (2024)
Towards Goal-oriented Intelligent Tutoring Systems in Online Education
by: Deng, Yang, et al.
Published: (2023)
Reasoning Is All You Need for Urban Planning AI
by: Yang, Sijie, et al.
Published: (2025)
SpikeZIP-TF: Conversion is All You Need for Transformer-based SNN
by: You, Kang, et al.
Published: (2024)
Self-supervised Dataset Distillation: A Good Compression Is All You Need
by: Zhou, Muxin, et al.
Published: (2024)
[MASK] is All You Need
by: Hu, Vincent Tao, et al.
Published: (2024)
COIG-CQIA: Quality is All You Need for Chinese Instruction Fine-tuning
by: Bai, Yuelin, et al.
Published: (2024)
Blending Is All You Need: Cheaper, Better Alternative to Trillion-Parameters LLM
by: Lu, Xiaoding, et al.
Published: (2024)
Principled Instructions Are All You Need for Questioning LLaMA-1/2, GPT-3.5/4
by: Bsharat, Sondos Mahmoud, et al.
Published: (2023)
All You Need Is Synthetic Task Augmentation
by: Godin, Guillaume
Published: (2025)
Element-wise Attention Is All You Need
by: Feng, Guoxin
Published: (2025)