| Main Authors: | Gan, Zeyu, Ren, Ruifeng, Yao, Wei, Hu, Xiaolin, Xu, Gengze, Qian, Chen, Tang, Huayi, Gong, Zixuan, Yao, Xinhao, Tang, Pengwei, Dou, Zhenxing, Liu, Yong |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2601.02907 |
Similar Items
Towards Auto-Regressive Next-Token Prediction: In-Context Learning Emerges from Generalization
by: Gong, Zixuan, et al.
Published: (2025)
Theoretical Insights into Fine-Tuning Attention Mechanism: Generalization and Optimization
by: Yao, Xinhao, et al.
Published: (2024)
Transformers as Intrinsic Optimizers: Forward Inference through the Energy Principle
by: Ren, Ruifeng, et al.
Published: (2025)
On Weak-to-Strong Generalization and f-Divergence
by: Yao, Wei, et al.
Published: (2025)
ADePT: Adaptive Decomposed Prompt Tuning for Parameter-Efficient Fine-tuning
by: Tang, Pengwei, et al.
Published: (2025)
Sparsity is Combinatorial Depth: Quantifying MoE Expressivity via Tropical Geometry
by: Su, Ye, et al.
Published: (2026)
Compositional Generalization from Learned Skills via CoT Training: A Theoretical and Structural Analysis for Reasoning
by: Yao, Xinhao, et al.
Published: (2025)
Effective Frontiers: A Unification of Neural Scaling Laws
by: Zou, Jiaxuan, et al.
Published: (2026)
Enhancing In-Context Learning Performance with just SVD-Based Weight Pruning: A Theoretical Perspective
by: Yao, Xinhao, et al.
Published: (2024)
Understanding Model Ensemble in Transferable Adversarial Attack
by: Yao, Wei, et al.
Published: (2024)
Information-Theoretic Generalization Bounds for Transductive Learning and its Applications
by: Tang, Huayi, et al.
Published: (2023)
PAC-Bayesian Generalization Bounds for Graph Convolutional Networks on Inductive Node Classification
by: Tang, Huayi, et al.
Published: (2025)
On the Emergence of Weak-to-Strong Generalization: A Bias-Variance Perspective
by: Xu, Gengze, et al.
Published: (2025)
Perfect Alignment May be Poisonous to Graph Contrastive Learning
by: Liu, Jingyu, et al.
Published: (2023)
Put the Space of LoRA Initialization to the Extreme to Preserve Pre-trained Knowledge
by: Tang, Pengwei, et al.
Published: (2025)
The Debate on RLVR Reasoning Capability Boundary: Shrinkage, Expansion, or Both? A Two-Stage Dynamic View
by: Yao, Xinhao, et al.
Published: (2025)
Learning Mixture-of-Experts for General-Purpose Black-Box Discrete Optimization
by: Liu, Shengcai, et al.
Published: (2024)
A Method for Constructing a Digital Transformation Driving Mechanism Based on Semantic Understanding of Large Models
by: Liu, Huayi
Published: (2026)
Active Semantic Perception
by: Tang, Huayi, et al.
Published: (2025)
Improved Learning Rates for Stochastic Optimization
by: Li, Shaojie, et al.
Published: (2021)
Breaking the Black-Box: Confidence-Guided Model Inversion Attack for Distribution Shift
by: Liu, Xinhao, et al.
Published: (2024)
The Capabilities and Limitations of Weak-to-Strong Generalization: Generalization and Calibration
by: Yao, Wei, et al.
Published: (2025)
A Survey of Calibration Process for Black-Box LLMs
by: Xie, Liangru, et al.
Published: (2024)
No Black Box Anymore: Demystifying Clinical Predictive Modeling with Temporal-Feature Cross Attention Mechanism
by: Li, Yubo, et al.
Published: (2025)
On the Blessing of Pre-training in Weak-to-Strong Generalization
by: Yao, Wei, et al.
Published: (2026)
Towards Understanding How Transformers Learn In-context Through a Representation Learning Lens
by: Ren, Ruifeng, et al.
Published: (2023)
Revisiting Transformers through the Lens of Low Entropy and Dynamic Sparsity
by: Ren, Ruifeng, et al.
Published: (2025)
Opening the Black Box: A Survey on the Mechanisms of Multi-Step Reasoning in Large Language Models
by: Pan, Liangming, et al.
Published: (2026)
Multimodal Hierarchical Attention Framework for Efficient Weakly Supervised Few‐Shot Segmentation Under SAGIN Environment
by: Wenqiang Yuan, et al.
Published: (2025)
Beyond the Black Box: A Cognitive Architecture for Explainable and Aligned AI
by: Keyi, Hu
Published: (2025)
Efficient Black-Box Fault Localization for System-Level Test Code Using Large Language Models
by: Yaraghi, Ahmadreza Saboor, et al.
Published: (2025)
SyncGuard: Robust Audio Watermarking Capable of Countering Desynchronization Attacks
by: Gan, Zhenliang, et al.
Published: (2025)
Efficient Non-Parametric Uncertainty Quantification for Black-Box Large Language Models and Decision Planning
by: Tsai, Yao-Hung Hubert, et al.
Published: (2024)
Towards a Theoretical Understanding of Synthetic Data in LLM Post-Training: A Reverse-Bottleneck Perspective
by: Gan, Zeyu, et al.
Published: (2024)
Beyond Entropy: Region Confidence Proxy for Wild Test-Time Adaptation
by: Hu, Zixuan, et al.
Published: (2025)
Pediatric Chronic Monteggia Fractures: Insights From a Comprehensive Review
by: Gengze Li, et al.
Published: (2025)
Learning to Learn from APIs: Black-Box Data-Free Meta-Learning
by: Hu, Zixuan, et al.
Published: (2023)
Probing Stochastic Ultralight Dark Matter with Space-based Gravitational-Wave Interferometers
by: Yao, Yue-Hui, et al.
Published: (2024)
Beyond Accuracy: A Geometric Stability Analysis of Large Language Models in Chess Evaluation
by: Song, Xidan, et al.
Published: (2025)
Robust Guided Diffusion for Offline Black-Box Optimization
by: Chen, Can Sam, et al.
Published: (2024)