| Main Authors: | Yokoi, Soma, Sato, Issei |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2406.12353 |
Similar Items
Power Distribution Bridges Sampling, Self-Reward RL, and Self-Distillation
by: Tomihari, Akiyoshi, et al.
Published: (2026)
Can Test-time Computation Mitigate Reproduction Bias in Neural Symbolic Regression?
by: Sato, Shun, et al.
Published: (2025)
Max-pooling Network Revisited: Analyzing the Role of Semantic Probability in Multiple Instance Learning for Hallucination Detection
by: Fujikawa, Shota, et al.
Published: (2026)
End-to-End Training Induces Information Bottleneck through Layer-Role Differentiation: A Comparative Analysis with Layer-wise Training
by: Sakamoto, Keitaro, et al.
Published: (2024)
Benign Overfitting in Token Selection of Attention Mechanism
by: Sakamoto, Keitaro, et al.
Published: (2024)
Multiplicative Logit Adjustment Approximates Neural-Collapse-Aware Decision Boundary Adjustment
by: Hasegawa, Naoya, et al.
Published: (2024)
On Expressive Power of Looped Transformers: Theoretical Analysis and Enhancement via Timestep Encoding
by: Xu, Kevin, et al.
Published: (2024)
Understanding Linear Probing then Fine-tuning Language Models from NTK Perspective
by: Tomihari, Akiyoshi, et al.
Published: (2024)
Explaining Grokking and Information Bottleneck through Neural Collapse Emergence
by: Sakamoto, Keitaro, et al.
Published: (2025)
Exploring Weight Balancing on Long-Tailed Recognition Problem
by: Hasegawa, Naoya, et al.
Published: (2023)
Rethinking Associative Memory Mechanism in Induction Head
by: Wang, Shuo, et al.
Published: (2024)
On the Optimal Memorization Capacity of Transformers
by: Kajitsuka, Tokio, et al.
Published: (2024)
Understanding Generalization in Physics Informed Models through Affine Variety Dimensions
by: Koshizuka, Takeshi, et al.
Published: (2025)
To CoT or To Loop? A Formal Comparison Between Chain-of-Thought and Looped Transformers
by: Xu, Kevin, et al.
Published: (2025)
Fix Initial Codes and Iteratively Refine Textual Directions Toward Safe Multi-Turn Code Correction
by: Tanaka, Yuto, et al.
Published: (2026)
A Formal Comparison Between Chain of Thought and Latent Thought
by: Xu, Kevin, et al.
Published: (2025)
Bayesian Symbolic Regression via Posterior Sampling
by: Bomarito, Geoffrey F., et al.
Published: (2025)
Understanding Transformer Optimization via Gradient Heterogeneity
by: Tomihari, Akiyoshi, et al.
Published: (2025)
Are Transformers with One Layer Self-Attention Using Low-Rank Weight Matrices Universal Approximators?
by: Kajitsuka, Tokio, et al.
Published: (2023)
Posterior Sampling-Based Bayesian Optimization with Tighter Bayesian Regret Bounds
by: Takeno, Shion, et al.
Published: (2023)
Gibbs Sampling the Posterior of Neural Networks
by: Piccioli, Giovanni, et al.
Published: (2023)
Outlier-robust Diffusion Posterior Sampling for Bayesian Inverse Problems
by: Yang, Yiming, et al.
Published: (2026)
Optimizing Posterior Samples for Bayesian Optimization via Rootfinding
by: Adebiyi, Taiwo A., et al.
Published: (2024)
Practical Bayesian Algorithm Execution via Posterior Sampling
by: Cheng, Chu Xin, et al.
Published: (2024)
Provable Diffusion Posterior Sampling for Bayesian Inversion
by: Chang, Jinyuan, et al.
Published: (2025)
Understanding the Expressivity and Trainability of Fourier Neural Operator: A Mean-Field Perspective
by: Koshizuka, Takeshi, et al.
Published: (2023)
Regret Analysis of Posterior Sampling-Based Expected Improvement for Bayesian Optimization
by: Takeno, Shion, et al.
Published: (2025)
On the Convergence of Locally Adaptive and Scalable Diffusion-Based Sampling Methods for Deep Bayesian Neural Network Posteriors
by: Rensmeyer, Tim, et al.
Published: (2024)
On the Interplay of Priors and Overparametrization in Bayesian Neural Network Posteriors
by: Kobialka, Julius, et al.
Published: (2026)
Calibrating the Predictions for Top-N Recommendations
by: Sato, Masahiro
Published: (2024)
Generative Posterior Networks for Approximately Bayesian Epistemic Uncertainty Estimation
by: Roderick, Melrose, et al.
Published: (2023)
Sampling from Bayesian Neural Network Posteriors with Symmetric Minibatch Splitting Langevin Dynamics
by: Paulin, Daniel, et al.
Published: (2024)
Sketched Sum-Product Networks for Joins
by: Tsan, Brian, et al.
Published: (2025)
Forward Learning with Top-Down Feedback: Empirical and Analytical Characterization
by: Srinivasan, Ravi, et al.
Published: (2023)
Productively Deploying Emerging Models on Emerging Platforms: A Top-Down Approach for Testing and Debugging
by: Feng, Siyuan, et al.
Published: (2024)
Posterior Sampling for Continuing Environments
by: Xu, Wanqiao, et al.
Published: (2022)
Q-learning with Posterior Sampling
by: Agrawal, Priyank, et al.
Published: (2025)
Flexible Bayesian Last Layer Models Using Implicit Priors and Diffusion Posterior Sampling
by: Xu, Jian, et al.
Published: (2024)
On the Overlooked Pitfalls of Weight Decay and How to Mitigate Them: A Gradient-Norm Perspective
by: Xie, Zeke, et al.
Published: (2020)
Pareto Set Identification With Posterior Sampling
by: Kone, Cyrille, et al.
Published: (2024)