| Main Authors: | Yao, Wei, Yang, Wenkai, Xu, Gengze, Wang, Ziqiao, Lin, Yankai, Liu, Yong |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2502.01458 |
Similar Items
On Weak-to-Strong Generalization and f-Divergence
by: Yao, Wei, et al.
Published: (2025)
On the Emergence of Weak-to-Strong Generalization: A Bias-Variance Perspective
by: Xu, Gengze, et al.
Published: (2025)
Revisiting Weak-to-Strong Generalization in Theory and Practice: Reverse KL vs. Forward KL
by: Yao, Wei, et al.
Published: (2025)
On the Blessing of Pre-training in Weak-to-Strong Generalization
by: Yao, Wei, et al.
Published: (2026)
Theoretical Insights into Fine-Tuning Attention Mechanism: Generalization and Optimization
by: Yao, Xinhao, et al.
Published: (2024)
Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization
by: Yang, Wenkai, et al.
Published: (2024)
Learning beyond Teacher: Generalized On-Policy Distillation with Reward Extrapolation
by: Yang, Wenkai, et al.
Published: (2026)
On the Limitations and Capabilities of Position Embeddings for Length Generalization
by: Chen, Yang, et al.
Published: (2025)
DeepCritic: Deliberate Critique with Large Language Models
by: Yang, Wenkai, et al.
Published: (2025)
Generalization Bounds via Conditional $f$-Information
by: Wang, Ziqiao, et al.
Published: (2024)
On the Limited Generalization Capability of the Implicit Reward Model Induced by Direct Preference Optimization
by: Lin, Yong, et al.
Published: (2024)
SteinGen: Generating Fidelitous and Diverse Graph Samples
by: Reinert, Gesine, et al.
Published: (2024)
Theoretical Analysis of Weak-to-Strong Generalization
by: Lang, Hunter, et al.
Published: (2024)
Quantifying the Gain in Weak-to-Strong Generalization
by: Charikar, Moses, et al.
Published: (2024)
Two Facets of SDE Under an Information-Theoretic Lens: Generalization of SGD via Training Trajectories and via Terminal States
by: Wang, Ziqiao, et al.
Published: (2022)
Weak-to-Strong Generalization with Failure Trajectories: A Tree-based Approach to Elicit Optimal Policy in Strong Models
by: Ye, Ruimeng, et al.
Published: (2025)
DeepThinkVLA: Enhancing Reasoning Capability of Vision-Language-Action Models
by: Yin, Cheng, et al.
Published: (2025)
On the Mechanisms of Weak-to-Strong Generalization: A Theoretical Perspective
by: Moniri, Behrad, et al.
Published: (2025)
Provable Weak-to-Strong Generalization via Benign Overfitting
by: Wu, David X., et al.
Published: (2024)
Weak-to-Strong Generalization is Nearly Inevitable (in Linear Models)
by: Geng, Scott, et al.
Published: (2026)
LaSeR: Reinforcement Learning with Last-Token Self-Rewarding
by: Yang, Wenkai, et al.
Published: (2025)
Weak-to-Strong Generalization under Distribution Shifts
by: Jeon, Myeongho, et al.
Published: (2025)
Generalization in Federated Learning: A Conditional Mutual Information Framework
by: Wang, Ziqiao, et al.
Published: (2025)
Rethinking Training Dynamics in Scale-wise Autoregressive Generation
by: Zhou, Gengze, et al.
Published: (2025)
Weak-to-Strong Generalization Even in Random Feature Networks, Provably
by: Medvedev, Marko, et al.
Published: (2025)
Generalizing Trust: Weak-to-Strong Trustworthiness in Language Models
by: Pawelczyk, Martin, et al.
Published: (2024)
Weak-to-Strong Generalization Through the Data-Centric Lens
by: Shin, Changho, et al.
Published: (2024)
Zero-to-Strong Generalization: Eliciting Strong Capabilities of Large Language Models Iteratively without Gold Labels
by: Liu, Chaoqun, et al.
Published: (2024)
The Mechanism of Weak-to-Strong Generalization: Feature Elicitation from Latent Knowledge
by: Awano, Ryoya, et al.
Published: (2026)
Bayesian WeakS-to-Strong from Text Classification to Generation
by: Cui, Ziyun, et al.
Published: (2024)
Exploring the Generalization Capabilities of AID-based Bi-level Optimization
by: Chen, Congliang, et al.
Published: (2024)
From Linear to Nonlinear: Provable Weak-to-Strong Generalization through Feature Learning
by: Oh, Junsoo, et al.
Published: (2025)
High-dimensional Analysis of Knowledge Distillation: Weak-to-Strong Generalization and Scaling Laws
by: Ildiz, M. Emrullah, et al.
Published: (2024)
Trust Functions: Near-Lossless Weak-to-Strong Generalization by Learning When to Trust the Weak Teacher
by: Uzunoglu, Arda, et al.
Published: (2026)
Capabilities and Fundamental Limits of Latent Chain-of-Thought
by: Zou, Jiaxuan, et al.
Published: (2026)
Energy-Guided Generative Modeling for Low-Energy Molecular Structure Discovery
by: Xu, Guikun, et al.
Published: (2025)
Weak-to-Strong Generalization beyond Accuracy: a Pilot Study in Safety, Toxicity, and Legal Reasoning
by: Ye, Ruimeng, et al.
Published: (2024)
Representations Shape Weak-to-Strong Generalization: Theoretical Insights and Empirical Predictions
by: Xue, Yihao, et al.
Published: (2025)
Discrepancies are Virtue: Weak-to-Strong Generalization through Lens of Intrinsic Dimension
by: Dong, Yijun, et al.
Published: (2025)
DelTA: Discriminative Token Credit Assignment for Reinforcement Learning from Verifiable Rewards
by: Zhang, Kaiyi, et al.
Published: (2026)