| Main Authors: | Qi, Xianbiao, Chen, Marco, Ye, Jiaquan, He, Yelin, Xiao, Rong |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.04669 |
Similar Items
SimpleGPT: Improving GPT via A Simple Normalization Strategy
by: Chen, Marco, et al.
Published: (2026)
DNT: a Deeply Normalized Transformer that can be trained by Momentum SGD
by: Qi, Xianbiao, et al.
Published: (2025)
HCAttention: Extreme KV Cache Compression via Heterogeneous Attention Computing for LLMs
by: Yang, Dongquan, et al.
Published: (2025)
Muon is Scalable for LLM Training
by: Liu, Jingyuan, et al.
Published: (2025)
Taming Transformer Without Using Learning Rate Warmup
by: Qi, Xianbiao, et al.
Published: (2025)
Mousse: Rectifying the Geometry of Muon with Curvature-Aware Preconditioning
by: Zhang, Yechen, et al.
Published: (2026)
Why Does ChatGPT "Delve" So Much? Exploring the Sources of Lexical Overrepresentation in Large Language Models
by: Juzek, Tom S., et al.
Published: (2024)
DeepCritic: Deliberate Critique with Large Language Models
by: Yang, Wenkai, et al.
Published: (2025)
Delving into RL for Image Generation with CoT: A Study on DPO vs. GRPO
by: Tong, Chengzhuo, et al.
Published: (2025)
Real Deep Research for AI, Robotics and Beyond
by: Zou, Xueyan, et al.
Published: (2025)
An Analysis and Mitigation of the Reversal Curse
by: Lv, Ang, et al.
Published: (2023)
Frac-Connections: Fractional Extension of Hyper-Connections
by: Zhu, Defa, et al.
Published: (2025)
Neutral Residues: Revisiting Adapters for Model Extension
by: Talla, Franck Signe, et al.
Published: (2024)
Beyond Isolation: Multi-Agent Synergy for Improving Knowledge Graph Construction
by: Ye, Hongbin, et al.
Published: (2023)
Benchmarking for Domain-Specific LLMs: A Case Study on Academia and Beyond
by: Chen, Rubing, et al.
Published: (2025)
DeepResearcher: Scaling Deep Research via Reinforcement Learning in Real-world Environments
by: Zheng, Yuxiang, et al.
Published: (2025)
RHYTHM: Reasoning with Hierarchical Temporal Tokenization for Human Mobility
by: He, Haoyu, et al.
Published: (2025)
Rubrics as Rewards: Reinforcement Learning Beyond Verifiable Domains
by: Gunjal, Anisha, et al.
Published: (2025)
Beyond Scalar Reward Model: Learning Generative Judge from Preference Data
by: Ye, Ziyi, et al.
Published: (2024)
Training-Free Exponential Context Extension via Cascading KV Cache
by: Willette, Jeffrey, et al.
Published: (2024)
YaRN: Efficient Context Window Extension of Large Language Models
by: Peng, Bowen, et al.
Published: (2023)
CogAtom: From Cognitive Atoms to Olympiad-level Mathematical Reasoning in Large Language Models
by: Chen, Zhuofan, et al.
Published: (2025)
A Theoretical Understanding of Chain-of-Thought: Coherent Reasoning and Error-Aware Demonstration
by: Cui, Yingqian, et al.
Published: (2024)
CORI: CJKV Benchmark with Romanization Integration -- A step towards Cross-lingual Transfer Beyond Textual Scripts
by: Nguyen, Hoang H., et al.
Published: (2024)
Beyond the Singular: Revealing the Value of Multiple Generations in Benchmark Evaluation
by: Zhang, Wenbo, et al.
Published: (2025)
Beyond Correlation: Refutation-Validated Aspect-Based Sentiment Analysis for Explainable Energy Market Returns
by: van der Heever, Wihan, et al.
Published: (2026)
DynamixSFT: Dynamic Mixture Optimization of Instruction Tuning Collections
by: Shin, Haebin, et al.
Published: (2025)
SpanNorm: Reconciling Training Stability and Performance in Deep Transformers
by: Wang, Chao, et al.
Published: (2026)
Beyond Introspection: Reinforcing Thinking via Externalist Behavioral Feedback
by: Yang, Diji, et al.
Published: (2024)
Efficient-DLM: From Autoregressive to Diffusion Language Models, and Beyond in Speed
by: Fu, Yonggan, et al.
Published: (2025)
Generation Meets Verification: Accelerating Large Language Model Inference with Smart Parallel Auto-Correct Decoding
by: Yi, Hanling, et al.
Published: (2024)
NOVER: Incentive Training for Language Models via Verifier-Free Reinforcement Learning
by: Liu, Wei, et al.
Published: (2025)
TTSR: Test-Time Self-Reflection for Continual Reasoning Improvement
by: He, Haoyang, et al.
Published: (2026)
Beyond Confidence: Rethinking Self-Assessments for Performance Prediction in LLMs
by: Bhattacharyya, Sree, et al.
Published: (2026)
Enhancing Quantitative Reasoning Skills of Large Language Models through Dimension Perception
by: Huang, Yuncheng, et al.
Published: (2023)
Deep Dense Exploration for LLM Reinforcement Learning via Pivot-Driven Resampling
by: Guo, Yiran, et al.
Published: (2026)
ChemAmp: Amplified Chemistry Tools via Composable Agents
by: Li, Zhucong, et al.
Published: (2025)
SLOT: Sample-specific Language Model Optimization at Test-time
by: Hu, Yang, et al.
Published: (2025)
GoRA: Gradient-driven Adaptive Low Rank Adaptation
by: He, Haonan, et al.
Published: (2025)
DeepOnto: A Python Package for Ontology Engineering with Deep Learning
by: He, Yuan, et al.
Published: (2023)