:: Library Catalog

Bibliographic Details
Main authors:	Fu, Guoji; Suzuki, Taiji; Lee, Wee Sun; Nitanda, Atsushi
Format:	Preprint
Published:	2026
Subjects:	Machine Learning
Online access:	https://arxiv.org/abs/2605.15822

Similar Items

Approximation and Generalization Abilities of Score-based Neural Network Generative Models for Sub-Gaussian Distributions
by: Fu, Guoji, et al.
Published: (2025)

Towards a Unified Analysis of Neural Networks in Nonparametric Instrumental Variable Regression: Optimization and Generalization
by: Chen, Zonghao, et al.
Published: (2025)

Direct Distributional Optimization for Provable Alignment of Diffusion Models
by: Kawata, Ryotaro, et al.
Published: (2025)

Implicit Graph Neural Diffusion Networks: Convergence, Generalization, and Over-Smoothing
by: Fu, Guoji, et al.
Published: (2023)

Propagation of Chaos for Mean-Field Langevin Dynamics and its Application to Model Ensemble
by: Nitanda, Atsushi, et al.
Published: (2025)

Koopman-based generalization bound: New aspect for full-rank weights
by: Hashimoto, Yuka, et al.
Published: (2023)

Improved Particle Approximation Error for Mean Field Neural Networks
by: Nitanda, Atsushi
Published: (2024)

DPRM: A Plug-in Doob h transform-induced Token-Ordering Module for Diffusion Language Models
by: Bu, Dake, et al.
Published: (2026)

Continual Reinforcement Learning by Planning with Online World Models
by: Liu, Zichen, et al.
Published: (2025)

Provably Transformers Harness Multi-Concept Word Semantics for Efficient In-Context Learning
by: Bu, Dake, et al.
Published: (2024)

Provable Benefit of Curriculum in Transformer Tree-Reasoning Post-Training
by: Bu, Dake, et al.
Published: (2025)

Provable In-Context Vector Arithmetic via Retrieving Task Concepts
by: Bu, Dake, et al.
Published: (2025)

Post-Training as Reweighting: A Stochastic View of Reasoning Trajectories in Language Models
by: Bu, Dake, et al.
Published: (2025)

Alternating Diffusion for Proximal Sampling with Zeroth Order Queries
by: Takagi, Hirohane, et al.
Published: (2026)

Uniform convergence of the smooth calibration error and its relationship with functional gradient
by: Futami, Futoshi, et al.
Published: (2025)

How Does Preconditioning Guide Feature Learning in Deep Neural Networks?
by: Yoshida, Kotaro, et al.
Published: (2025)

From Saddle Points Toward Global Minima: A Newton-Type Method on Wasserstein Space
by: Lascu, Razvan-Andrei, et al.
Published: (2026)

Hessian-guided Perturbed Wasserstein Gradient Flows for Escaping Saddle Points
by: Yamamoto, Naoya, et al.
Published: (2025)

Mirror Descent Policy Optimisation for Robust Constrained Markov Decision Processes
by: Bossens, David M., et al.
Published: (2025)

Statistical Analysis of the Sinkhorn Iterations for Two-Sample Schrödinger Bridge Estimation
by: Maeda, Ibuki, et al.
Published: (2025)

Slowly Annealed Langevin Dynamics: Theory and Applications to Training-Free Guided Generation
by: Nitanda, Atsushi, et al.
Published: (2026)

The Mechanism of Weak-to-Strong Generalization: Feature Elicitation from Latent Knowledge
by: Awano, Ryoya, et al.
Published: (2026)

In-Context Learning Is Provably Bayesian Inference: A Generalization Theory for Meta-Learning
by: Wakayama, Tomoya, et al.
Published: (2025)

State Space Models are Provably Comparable to Transformers in Dynamic Token Selection
by: Nishikawa, Naoki, et al.
Published: (2024)

Constrained Layout Generation with Factor Graphs
by: Dupty, Mohammed Haroon, et al.
Published: (2024)

Direct Density Ratio Optimization: A Statistically Consistent Approach to Aligning Large Language Models
by: Higuchi, Rei, et al.
Published: (2025)

Why is parameter averaging beneficial in SGD? An objective smoothing perspective
by: Nitanda, Atsushi, et al.
Published: (2023)

Transformers as Measure-Theoretic Associative Memory: A Statistical Perspective and Minimax Optimality
by: Kawata, Ryotaro, et al.
Published: (2026)

Transformers Learn Nonlinear Features In Context: Nonconvex Mean-field Dynamics on the Attention Landscape
by: Kim, Juno, et al.
Published: (2024)

Deep Two-Way Matrix Reordering for Relational Data Analysis
by: Watanabe, Chihiro, et al.
Published: (2021)

Transformers Provably Solve Parity Efficiently with Chain of Thought
by: Kim, Juno, et al.
Published: (2024)

Mean-field Analysis on Two-layer Neural Networks from a Kernel Perspective
by: Takakura, Shokichi, et al.
Published: (2024)

AutoLL: Automatic Linear Layout of Graphs based on Deep Neural Network
by: Watanabe, Chihiro, et al.
Published: (2021)

Test time training enhances in-context learning of nonlinear functions
by: Kuwataka, Kento, et al.
Published: (2025)

Approximation and Estimation Ability of Transformers for Sequence-to-Sequence Functions with Infinite Dimensional Input
by: Takakura, Shokichi, et al.
Published: (2023)

Provably Learning Diffusion Models under the Manifold Hypothesis: Collapse and Refine
by: Huang, Wei, et al.
Published: (2026)

Wasserstein Convergence Guarantees for a General Class of Score-Based Generative Models
by: Gao, Xuefeng, et al.
Published: (2023)

Degrees of Freedom for Linear Attention: Distilling Softmax Attention with Optimal Feature Efficiency
by: Nishikawa, Naoki, et al.
Published: (2025)

Transformers are Minimax Optimal Nonparametric In-Context Learners
by: Kim, Juno, et al.
Published: (2024)

Mamba Can Learn Low-Dimensional Targets In-Context via Test-Time Feature Learning
by: Oh, Junsoo, et al.
Published: (2025)