:: Library Catalog

Bibliographic Details
Main Authors:	Qi, Xianbiao; Chen, Marco; Ye, Jiaquan; He, Yelin; Xiao, Rong
Format:	Preprint
Published:	2026
Subjects:	Machine Learning; Artificial Intelligence; Computation and Language
Online Access:	https://arxiv.org/abs/2602.04669

Similar Items

SimpleGPT: Improving GPT via A Simple Normalization Strategy
by: Chen, Marco, et al.
Published: (2026)

DNT: a Deeply Normalized Transformer that can be trained by Momentum SGD
by: Qi, Xianbiao, et al.
Published: (2025)

HCAttention: Extreme KV Cache Compression via Heterogeneous Attention Computing for LLMs
by: Yang, Dongquan, et al.
Published: (2025)

Muon is Scalable for LLM Training
by: Liu, Jingyuan, et al.
Published: (2025)

Taming Transformer Without Using Learning Rate Warmup
by: Qi, Xianbiao, et al.
Published: (2025)

Mousse: Rectifying the Geometry of Muon with Curvature-Aware Preconditioning
by: Zhang, Yechen, et al.
Published: (2026)

Why Does ChatGPT "Delve" So Much? Exploring the Sources of Lexical Overrepresentation in Large Language Models
by: Juzek, Tom S., et al.
Published: (2024)

DeepCritic: Deliberate Critique with Large Language Models
by: Yang, Wenkai, et al.
Published: (2025)

Delving into RL for Image Generation with CoT: A Study on DPO vs. GRPO
by: Tong, Chengzhuo, et al.
Published: (2025)

Real Deep Research for AI, Robotics and Beyond
by: Zou, Xueyan, et al.
Published: (2025)

An Analysis and Mitigation of the Reversal Curse
by: Lv, Ang, et al.
Published: (2023)

Frac-Connections: Fractional Extension of Hyper-Connections
by: Zhu, Defa, et al.
Published: (2025)

Neutral Residues: Revisiting Adapters for Model Extension
by: Talla, Franck Signe, et al.
Published: (2024)

Beyond Isolation: Multi-Agent Synergy for Improving Knowledge Graph Construction
by: Ye, Hongbin, et al.
Published: (2023)

Benchmarking for Domain-Specific LLMs: A Case Study on Academia and Beyond
by: Chen, Rubing, et al.
Published: (2025)

DeepResearcher: Scaling Deep Research via Reinforcement Learning in Real-world Environments
by: Zheng, Yuxiang, et al.
Published: (2025)

RHYTHM: Reasoning with Hierarchical Temporal Tokenization for Human Mobility
by: He, Haoyu, et al.
Published: (2025)

Rubrics as Rewards: Reinforcement Learning Beyond Verifiable Domains
by: Gunjal, Anisha, et al.
Published: (2025)

Beyond Scalar Reward Model: Learning Generative Judge from Preference Data
by: Ye, Ziyi, et al.
Published: (2024)

Training-Free Exponential Context Extension via Cascading KV Cache
by: Willette, Jeffrey, et al.
Published: (2024)

YaRN: Efficient Context Window Extension of Large Language Models
by: Peng, Bowen, et al.
Published: (2023)

CogAtom: From Cognitive Atoms to Olympiad-level Mathematical Reasoning in Large Language Models
by: Chen, Zhuofan, et al.
Published: (2025)

A Theoretical Understanding of Chain-of-Thought: Coherent Reasoning and Error-Aware Demonstration
by: Cui, Yingqian, et al.
Published: (2024)

CORI: CJKV Benchmark with Romanization Integration -- A step towards Cross-lingual Transfer Beyond Textual Scripts
by: Nguyen, Hoang H., et al.
Published: (2024)

Beyond the Singular: Revealing the Value of Multiple Generations in Benchmark Evaluation
by: Zhang, Wenbo, et al.
Published: (2025)

Beyond Correlation: Refutation-Validated Aspect-Based Sentiment Analysis for Explainable Energy Market Returns
by: van der Heever, Wihan, et al.
Published: (2026)

DynamixSFT: Dynamic Mixture Optimization of Instruction Tuning Collections
by: Shin, Haebin, et al.
Published: (2025)

SpanNorm: Reconciling Training Stability and Performance in Deep Transformers
by: Wang, Chao, et al.
Published: (2026)

Beyond Introspection: Reinforcing Thinking via Externalist Behavioral Feedback
by: Yang, Diji, et al.
Published: (2024)

Efficient-DLM: From Autoregressive to Diffusion Language Models, and Beyond in Speed
by: Fu, Yonggan, et al.
Published: (2025)

Generation Meets Verification: Accelerating Large Language Model Inference with Smart Parallel Auto-Correct Decoding
by: Yi, Hanling, et al.
Published: (2024)

NOVER: Incentive Training for Language Models via Verifier-Free Reinforcement Learning
by: Liu, Wei, et al.
Published: (2025)

TTSR: Test-Time Self-Reflection for Continual Reasoning Improvement
by: He, Haoyang, et al.
Published: (2026)

Beyond Confidence: Rethinking Self-Assessments for Performance Prediction in LLMs
by: Bhattacharyya, Sree, et al.
Published: (2026)

Enhancing Quantitative Reasoning Skills of Large Language Models through Dimension Perception
by: Huang, Yuncheng, et al.
Published: (2023)

Deep Dense Exploration for LLM Reinforcement Learning via Pivot-Driven Resampling
by: Guo, Yiran, et al.
Published: (2026)

ChemAmp: Amplified Chemistry Tools via Composable Agents
by: Li, Zhucong, et al.
Published: (2025)

SLOT: Sample-specific Language Model Optimization at Test-time
by: Hu, Yang, et al.
Published: (2025)

GoRA: Gradient-driven Adaptive Low Rank Adaptation
by: He, Haonan, et al.
Published: (2025)

DeepOnto: A Python Package for Ontology Engineering with Deep Learning
by: He, Yuan, et al.
Published: (2023)