:: Library Catalog

Bibliographic Details
Main Authors:	Watanabe, Chihiro; Suzuki, Taiji
Format:	Preprint
Published:	2021
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2103.14203

Similar Items

AutoLL: Automatic Linear Layout of Graphs based on Deep Neural Network
by: Watanabe, Chihiro, et al.
Published: (2021)

Mean-field Analysis on Two-layer Neural Networks from a Kernel Perspective
by: Takakura, Shokichi, et al.
Published: (2024)

Self-Supervised Learning for Sparse Matrix Reordering
by: Li, Ziwei, et al.
Published: (2026)

Transformers Learn Nonlinear Features In Context: Nonconvex Mean-field Dynamics on the Attention Landscape
by: Kim, Juno, et al.
Published: (2024)

The Mechanism of Weak-to-Strong Generalization: Feature Elicitation from Latent Knowledge
by: Awano, Ryoya, et al.
Published: (2026)

State Space Models are Provably Comparable to Transformers in Dynamic Token Selection
by: Nishikawa, Naoki, et al.
Published: (2024)

Transformers Provably Solve Parity Efficiently with Chain of Thought
by: Kim, Juno, et al.
Published: (2024)

Test time training enhances in-context learning of nonlinear functions
by: Kuwataka, Kento, et al.
Published: (2025)

Transformers as Measure-Theoretic Associative Memory: A Statistical Perspective and Minimax Optimality
by: Kawata, Ryotaro, et al.
Published: (2026)

In-Context Learning Is Provably Bayesian Inference: A Generalization Theory for Meta-Learning
by: Wakayama, Tomoya, et al.
Published: (2025)

Approximation and Estimation Ability of Transformers for Sequence-to-Sequence Functions with Infinite Dimensional Input
by: Takakura, Shokichi, et al.
Published: (2023)

Empirical Cumulative Distribution Function Clustering for LLM-based Agent System Analysis
by: Watanabe, Chihiro, et al.
Published: (2026)

MultiwayPAM: Multiway Partitioning Around Medoids for LLM-as-a-Judge Score Analysis
by: Watanabe, Chihiro, et al.
Published: (2026)

Direct Density Ratio Optimization: A Statistically Consistent Approach to Aligning Large Language Models
by: Higuchi, Rei, et al.
Published: (2025)

High-Dimensional Kernel Methods under Covariate Shift: Data-Dependent Implicit Regularization
by: Chen, Yihang, et al.
Published: (2024)

Towards a Unified Analysis of Neural Networks in Nonparametric Instrumental Variable Regression: Optimization and Generalization
by: Chen, Zonghao, et al.
Published: (2025)

Bridging the Gap between Sparse Matrix Reordering and Factorization: A Deep Learning Framework for Fill-in Reduction
by: Li, Ziwei, et al.
Published: (2026)

Optimality and Adaptivity of Deep Neural Features for Instrumental Variable Regression
by: Kim, Juno, et al.
Published: (2025)

From Saddle Points Toward Global Minima: A Newton-Type Method on Wasserstein Space
by: Lascu, Razvan-Andrei, et al.
Published: (2026)

Degrees of Freedom for Linear Attention: Distilling Softmax Attention with Optimal Feature Efficiency
by: Nishikawa, Naoki, et al.
Published: (2025)

Transformers are Minimax Optimal Nonparametric In-Context Learners
by: Kim, Juno, et al.
Published: (2024)

Mamba Can Learn Low-Dimensional Targets In-Context via Test-Time Feature Learning
by: Oh, Junsoo, et al.
Published: (2025)

Factorization-in-Loop: Proximal Fill-in Minimization for Sparse Matrix Reordering
by: Li, Ziwei, et al.
Published: (2025)

Hessian-guided Perturbed Wasserstein Gradient Flows for Escaping Saddle Points
by: Yamamoto, Naoya, et al.
Published: (2025)

Convergence Error Analysis of Reflected Gradient Langevin Dynamics for Globally Optimizing Non-Convex Constrained Problems
by: Sato, Kanji, et al.
Published: (2022)

On the Optimization and Generalization of Two-layer Transformers with Sign Gradient Descent
by: Li, Bingrui, et al.
Published: (2024)

A Relative-Budget Theory for Reinforcement Learning with Verifiable Rewards in Large Language Model Reasoning
by: Wachi, Akifumi, et al.
Published: (2026)

Order Matters: Improving Domain Adaptation by Reordering Data
by: Napoli, Andrea, et al.
Published: (2026)

Zero-Flow Encoders
by: Wang, Yakun, et al.
Published: (2026)

Direct Distributional Optimization for Provable Alignment of Diffusion Models
by: Kawata, Ryotaro, et al.
Published: (2025)

Learning sum of diverse features: computational hardness and efficient gradient-based training for ridge combinations
by: Oko, Kazusato, et al.
Published: (2024)

Pretrained transformer efficiently learns low-dimensional target functions in-context
by: Oko, Kazusato, et al.
Published: (2024)

Generalization Bound of Gradient Flow through Training Trajectory and Data-dependent Kernel
by: Chen, Yilan, et al.
Published: (2025)

GE2E-AC: Generalized End-to-End Loss Training for Accent Classification
by: Watanabe, Chihiro, et al.
Published: (2024)

How Neural Reward Models Learn Features for Policy Optimization: A Single-Index Analysis
by: Higuchi, Rei, et al.
Published: (2026)

On the Learning Dynamics of Two-layer Linear Networks with Label Noise SGD
by: Zhang, Tongcheng, et al.
Published: (2026)

Intrinsic Wasserstein Rates for Score-Based Generative Models on Smooth Manifolds
by: Fu, Guoji, et al.
Published: (2026)

Neural network learns low-dimensional polynomials with SGD near the information-theoretic limit
by: Lee, Jason D., et al.
Published: (2024)

Metastable Dynamics of Chain-of-Thought Reasoning: Provable Benefits of Search, RL and Distillation
by: Kim, Juno, et al.
Published: (2025)

From Shortcut to Induction Head: How Data Diversity Shapes Algorithm Selection in Transformers
by: Kawata, Ryotaro, et al.
Published: (2025)