| Main Authors: | Yin, Shangjian, Wei, Zhepei, Zhu, Xinyu, Chen, Wei-Lin, Meng, Yu |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2510.06652 |
Similar Items
InstructRAG: Instructing Retrieval-Augmented Generation via Self-Synthesized Rationales
by: Wei, Zhepei, et al.
Published: (2024)
AdaDecode: Accelerating LLM Decoding with Adaptive Layer Parallelism
by: Wei, Zhepei, et al.
Published: (2025)
Do LLM Evaluators Prefer Themselves for a Reason?
by: Chen, Wei-Lin, et al.
Published: (2025)
You Only Need Minimal RLVR Training: Extrapolating LLMs via Rank-1 Trajectories
by: Wei, Zhepei, et al.
Published: (2026)
The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning
by: Zhu, Xinyu, et al.
Published: (2025)
ECLM: Entity Level Language Model for Spoken Language Understanding with Chain of Intent
by: Yin, Shangjian, et al.
Published: (2024)
Icon$^{2}$: Aligning Large Language Models Using Self-Synthetic Preference Data via Inherent Regulation
by: Chen, Qiyuan, et al.
Published: (2025)
Self-Boosting Large Language Models with Synthetic Preference Data
by: Dong, Qingxiu, et al.
Published: (2024)
PIKA: Expert-Level Synthetic Datasets for Post-Training Alignment from Scratch
by: Yin, Shangjian, et al.
Published: (2025)
IterAlign: Iterative Constitutional Alignment of Large Language Models
by: Chen, Xiusi, et al.
Published: (2024)
Physio-DPO: Aligning Large Language Models with the Protein Energy Landscape to Eliminate Structural Hallucinations
by: Meng, QiWei
Published: (2026)
GRLO: Towards Generalizable Reinforcement Learning in Open-Ended Environments from Zero
by: Yin, Shangjian, et al.
Published: (2026)
CodecLM: Aligning Language Models with Tailored Synthetic Data
by: Wang, Zifeng, et al.
Published: (2024)
AdaSearch: Balancing Parametric Knowledge and Search in Large Language Models via Reinforcement Learning
by: Lin, Tzu-Han, et al.
Published: (2025)
Exploring Mathematical Extrapolation of Large Language Models with Synthetic Data
by: Li, Haolong, et al.
Published: (2024)
G-Zero: Self-Play for Open-Ended Generation from Zero Data
by: Huang, Chengsong, et al.
Published: (2026)
FactAlign: Long-form Factuality Alignment of Large Language Models
by: Huang, Chao-Wei, et al.
Published: (2024)
Evaluating Self-Generated Documents for Enhancing Retrieval-Augmented Generation with Large Language Models
by: Li, Jiatao, et al.
Published: (2024)
Latent Preference Coding: Aligning Large Language Models via Discrete Latent Codes
by: Gong, Zhuocheng, et al.
Published: (2025)
Are Large Language Models Good In-context Learners for Financial Sentiment Analysis?
by: Wei, Xinyu, et al.
Published: (2025)
Dynamic Noise Preference Optimization: Self-Improvement of Large Language Models with Self-Synthetic Data
by: Yang, Haoyan, et al.
Published: (2025)
CHIMERA: Compact Synthetic Data for Generalizable LLM Reasoning
by: Zhu, Xinyu, et al.
Published: (2026)
Aligning Large Language Models with Searcher Preferences
by: Wu, Wei, et al.
Published: (2026)
Aligning Large Language Models by On-Policy Self-Judgment
by: Lee, Sangkyu, et al.
Published: (2024)
Understanding Synthetic Context Extension via Retrieval Heads
by: Zhao, Xinyu, et al.
Published: (2024)
Trans-Zero: Self-Play Incentivizes Large Language Models for Multilingual Translation Without Parallel Data
by: Zou, Wei, et al.
Published: (2025)
DataGen: Unified Synthetic Dataset Generation via Large Language Models
by: Huang, Yue, et al.
Published: (2024)
Separate the Wheat from the Chaff: Winnowing Down Divergent Views in Retrieval Augmented Generation
by: Wang, Song, et al.
Published: (2025)
Self-Rewarding PPO: Aligning Large Language Models with Demonstrations Only
by: Zhang, Qingru, et al.
Published: (2025)
ResoFilter: Fine-grained Synthetic Data Filtering for Large Language Models through Data-Parameter Resonance Analysis
by: Tu, Zeao, et al.
Published: (2024)
Evaluating Large Language Models as Expert Annotators
by: Tseng, Yu-Min, et al.
Published: (2025)
OntoTune: Ontology-Driven Self-training for Aligning Large Language Models
by: Liu, Zhiqiang, et al.
Published: (2025)
Aligning Large Language Models with Implicit Preferences from User-Generated Content
by: Tan, Zhaoxuan, et al.
Published: (2025)
CM-Align: Consistency-based Multilingual Alignment for Large Language Models
by: Zhang, Xue, et al.
Published: (2025)
Control Large Language Models via Divide and Conquer
by: Li, Bingxuan, et al.
Published: (2024)
Q-Sparse: All Large Language Models can be Fully Sparsely-Activated
by: Wang, Hongyu, et al.
Published: (2024)
Aligning Large Language Models to Follow Instructions and Hallucinate Less via Effective Data Filtering
by: Si, Shuzheng, et al.
Published: (2025)
Enhancing Large Vision Language Models with Self-Training on Image Comprehension
by: Deng, Yihe, et al.
Published: (2024)
Rethinking Generative Large Language Model Evaluation for Semantic Comprehension
by: Wei, Fangyun, et al.
Published: (2024)
On the Diversity of Synthetic Data and its Impact on Training Large Language Models
by: Chen, Hao, et al.
Published: (2024)