| Main Authors: | Wang, Jiapeng, Hu, Yiwen, Gao, Yanzipeng, Wang, Haoyu, Wang, Shuo, Lu, Hongyu, Mao, Jiaxin, Zhao, Wayne Xin, Li, Junyi, Zhang, Xiao |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2512.23422 |
Similar Items
MergeMix: Optimizing Mid-Training Data Mixtures via Learnable Model Merging
by: Wang, Jiapeng, et al.
Published: (2026)
From IDs to Semantics: A Generative Framework for Cross-Domain Recommendation with Adaptive Semantic Tokenization
by: Hu, Peiyu, et al.
Published: (2025)
Experience-Guided Reflective Co-Evolution of Prompts and Heuristics for Automatic Algorithm Design
by: Liu, Yihong, et al.
Published: (2025)
An Integrated Data Processing Framework for Pretraining Foundation Models
by: Sun, Yiding, et al.
Published: (2024)
MaP: A Unified Framework for Reliable Evaluation of Pre-training Dynamics
by: Wang, Jiapeng, et al.
Published: (2025)
RV-Syn: Rational and Verifiable Mathematical Reasoning Data Synthesis based on Structured Function Library
by: Wang, Jiapeng, et al.
Published: (2025)
Universal Item Tokenization for Transferable Generative Recommendation
by: Zheng, Bowen, et al.
Published: (2025)
Entropy-Guided k-Guard Sampling for Long-Horizon Autoregressive Video Generation
by: Han, Yizhao, et al.
Published: (2026)
JiuZhang3.0: Efficiently Improving Mathematical Reasoning by Training Small Data Synthesis Models
by: Zhou, Kun, et al.
Published: (2024)
YuLan-Mini: An Open Data-efficient Language Model
by: Hu, Yiwen, et al.
Published: (2024)
ViD-GPT: Introducing GPT-style Autoregressive Generation in Video Diffusion Models
by: Gao, Kaifeng, et al.
Published: (2024)
Entropy-Guided Dynamic Tokens for Graph-LLM Alignment in Molecular Understanding
by: Jing, Zihao, et al.
Published: (2026)
WSM: Decay-Free Learning Rate Schedule via Checkpoint Merging for LLM Pre-training
by: Tian, Changxin, et al.
Published: (2025)
Controlled LLM Training on Spectral Sphere
by: Xie, Tian, et al.
Published: (2026)
Training-free Dropout Sampling for Semantic Token Acceptance in Speculative Decoding
by: Lee, Jeongtae, et al.
Published: (2026)
Enhancing Dropout-based Bayesian Neural Networks with Multi-Exit on FPGA
by: Chen, Hao Mark, et al.
Published: (2024)
Holistically Guided Monte Carlo Tree Search for Intricate Information Seeking
by: Ren, Ruiyang, et al.
Published: (2025)
Training-Free Text-Guided Image Editing with Visual Autoregressive Model
by: Wang, Yufei, et al.
Published: (2025)
REAR: A Relevance-Aware Retrieval-Augmented Framework for Open-Domain Question Answering
by: Wang, Yuhao, et al.
Published: (2024)
Selftok: Discrete Visual Tokens of Autoregression, by Diffusion, and for Reasoning
by: Wang, Bohan, et al.
Published: (2025)
BitDance: Scaling Autoregressive Generative Models with Binary Tokens
by: Ai, Yuang, et al.
Published: (2026)
Not All Tokens are Guided Equal: Improving Guidance in Visual Autoregressive Models
by: Nguyen, Ky Dan, et al.
Published: (2025)
Progressive Data Dropout: An Embarrassingly Simple Approach to Faster Training
by: Sathiyanarayanan, Shriram M, et al.
Published: (2025)
Ca2-VDM: Efficient Autoregressive Video Diffusion Model with Causal Generation and Cache Sharing
by: Gao, Kaifeng, et al.
Published: (2024)
ForesightKV: Optimizing KV Cache Eviction for Reasoning Models by Learning Long-Term Contribution
by: Dong, Zican, et al.
Published: (2026)
From Trial-and-Error to Improvement: A Systematic Analysis of LLM Exploration Mechanisms in RLVR
by: Deng, Jia, et al.
Published: (2025)
Entropy-Guided Data-Efficient Training for Multimodal Reasoning Reward Models
by: Yang, Shidong, et al.
Published: (2026)
TTPA: Token-level Tool-use Preference Alignment Training Framework with Fine-grained Evaluation
by: Huang, Chengrui, et al.
Published: (2025)
Adaptive Begin-of-Video Tokens for Autoregressive Video Diffusion Models
by: Cheng, Tianle, et al.
Published: (2025)
Improving Autoregressive Training with Dynamic Oracles
by: Yang, Jianing, et al.
Published: (2024)
Learning Domain-Invariant Representations for Cross-Domain Image Registration via Scene-Appearance Disentanglement
by: Qin, Jiahao, et al.
Published: (2026)
Hita: Holistic Tokenizer for Autoregressive Image Generation
by: Zheng, Anlin, et al.
Published: (2025)
Token Painter: Training-Free Text-Guided Image Inpainting via Mask Autoregressive Models
by: Jiang, Longtao, et al.
Published: (2025)
CURE: Critical-Token-Guided Re-Concatenation for Entropy-Collapse Prevention
by: Li, Qingbin, et al.
Published: (2025)
Autoregressive Pre-Training on Pixels and Texts
by: Chai, Yekun, et al.
Published: (2024)
Understand Before You Generate: Self-Guided Training for Autoregressive Image Generation
by: Yue, Xiaoyu, et al.
Published: (2025)
LRCP: Low-Rank Compressibility Guided Visual Token Pruning for Efficient LVLMs
by: Lu, Hongyu, et al.
Published: (2026)
CLAQ: Pushing the Limits of Low-Bit Post-Training Quantization for LLMs
by: Wang, Haoyu, et al.
Published: (2024)
V-Express: Conditional Dropout for Progressive Training of Portrait Video Generation
by: Wang, Cong, et al.
Published: (2024)
Bayesian Analysis for a Threshold Double Autoregressive Model With Explanatory Variables
by: Han Li, et al.
Published: (2025)