| Main Author: | Li, Henry |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2412.07935 |
Similar Items
Likelihood Training of Cascaded Diffusion Models via Hierarchical Volume-preserving Maps
by: Li, Henry, et al.
Published: (2025)
Hybrid Reward Normalization for Process-supervised Non-verifiable Agentic Tasks
by: Xu, Peiran, et al.
Published: (2025)
Frequency Adaptive Normalization For Non-stationary Time Series Forecasting
by: Ye, Weiwei, et al.
Published: (2024)
Fast Sampling via Discrete Non-Markov Diffusion Models with Predetermined Transition Time
by: Chen, Zixiang, et al.
Published: (2023)
Conda: Column-Normalized Adam for Training Large Language Models Faster
by: Wang, Junjie, et al.
Published: (2025)
TimeAPN: Adaptive Amplitude-Phase Non-Stationarity Normalization for Time Series Forecasting
by: Hu, Yue, et al.
Published: (2026)
Disentangling Neural Disjunctive Normal Form Models
by: Baugh, Kexin Gu, et al.
Published: (2025)
Mitigating Gradient Overlap in Deep Residual Networks with Gradient Normalization for Improved Non-Convex Optimization
by: Yun, Juyoung
Published: (2024)
Non-Cross Diffusion for Semantic Consistency
by: Zheng, Ziyang, et al.
Published: (2023)
Chemistry-Inspired Diffusion with Non-Differentiable Guidance
by: Shen, Yuchen, et al.
Published: (2024)
Bridging Distribution Gaps in Time Series Foundation Model Pretraining with Prototype-Guided Normalization
by: Gong, Peiliang, et al.
Published: (2025)
Unifying Continuous and Discrete Text Diffusion with Non-simultaneous Diffusion Processes
by: Li, Bocheng, et al.
Published: (2025)
On the Nonlinearity of Layer Normalization
by: Ni, Yunhao, et al.
Published: (2024)
AlphaGrad: Non-Linear Gradient Normalization Optimizer
by: Sane, Soham
Published: (2025)
Non-Markovian Discrete Diffusion with Causal Language Models
by: Zhang, Yangtian, et al.
Published: (2025)
IBNorm: Information-Bottleneck Inspired Normalization for Representation Learning
by: Zou, Xiandong, et al.
Published: (2025)
Non-stationary Diffusion For Probabilistic Time Series Forecasting
by: Ye, Weiwei, et al.
Published: (2025)
TimeGMM: Single-Pass Probabilistic Forecasting via Adaptive Gaussian Mixture Models with Reversible Normalization
by: Liu, Lei, et al.
Published: (2026)
Non-Identical Diffusion Models in MIMO-OFDM Channel Generation
by: Yang, Yuzhi, et al.
Published: (2025)
Does Your Optimizer Care How You Normalize? Normalization-Optimizer Coupling in LLM Training
by: Abouzeid, Abdelrahman
Published: (2026)
ReLU's Revival: On the Entropic Overload in Normalization-Free Large Language Models
by: Jha, Nandan Kumar, et al.
Published: (2024)
Interpretable Graph-Level Anomaly Detection via Contrast with Normal Prototypes
by: Zhao, Qiuran, et al.
Published: (2026)
FaultDiffusion: Few-Shot Fault Time Series Generation with Diffusion Model
by: Xu, Yi, et al.
Published: (2025)
Knowledge Graph Embedding by Normalizing Flows
by: Xiao, Changyi, et al.
Published: (2024)
Scaling CrossQ with Weight Normalization
by: Palenicek, Daniel, et al.
Published: (2025)
Normalized Architectures are Natively 4-Bit
by: Fishman, Maxim, et al.
Published: (2026)
Learning Rate Transfer in Normalized Transformers
by: Shigida, Boris, et al.
Published: (2026)
On the Weight Dynamics of Deep Normalized Networks
by: Mehmeti-Göpel, Christian H. X. Ali, et al.
Published: (2023)
Amortized Sampling with Transferable Normalizing Flows
by: Tan, Charlie B., et al.
Published: (2025)
BNPO: Beta Normalization Policy Optimization
by: Xiao, Changyi, et al.
Published: (2025)
ANAct: Adaptive Normalization for Activation Functions
by: Peiwen, Yuan, et al.
Published: (2022)
System-Embedded Diffusion Bridge Models
by: Sobieski, Bartlomiej, et al.
Published: (2025)
Enabling Causal Discovery in Post-Nonlinear Models with Normalizing Flows
by: Hoang, Nu, et al.
Published: (2024)
K-Score: Kalman Filter as a Principled Alternative to Reward Normalization in Reinforcement Learning
by: Xia, Zixuan, et al.
Published: (2026)
HoReN: Normalized Hopfield Retrieval for Large-Scale Sequential Model Editing
by: Fang, Yuan, et al.
Published: (2026)
EEGDM: Learning EEG Representation with Latent Diffusion Model
by: Wang, Shaocong, et al.
Published: (2025)
Dynamic Population Distribution Aware Human Trajectory Generation with Diffusion Model
by: Long, Qingyue, et al.
Published: (2025)
Breaking the Factorization Barrier in Diffusion Language Models
by: Li, Ian, et al.
Published: (2026)
IDLM: Inverse-distilled Diffusion Language Models
by: Li, David, et al.
Published: (2026)
Anomaly Detection and Generation with Diffusion Models: A Survey
by: Liu, Yang, et al.
Published: (2025)