| Main Authors: | Li, Yuanpeng, Hestness, Joel, Elhoseiny, Mohamed, Zhao, Liang, Church, Kenneth |
|---|---|
| Format: | Preprint |
| Published: | 2022 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2201.01942 |
Similar Items
Sparse maximal update parameterization: A holistic approach to sparse training dynamics
by: Dey, Nolan, et al.
Published: (2024)
No Culture Left Behind: ArtELingo-28, a Benchmark of WikiArt with Captions in 28 Languages
by: Mohamed, Youssef, et al.
Published: (2024)
Predicting Training Re-evaluation Curves Enables Effective Data Curriculums for LLMs
by: Bergsma, Shane, et al.
Published: (2025)
MediSwift: Efficient Sparse Pre-trained Biomedical Language Models
by: Thangarasa, Vithursan, et al.
Published: (2024)
Normalization Layer Per-Example Gradients are Sufficient to Predict Gradient Noise Scale in Transformers
by: Gray, Gavia, et al.
Published: (2024)
Structural Disentanglement of Causal and Correlated Concepts
by: Zhao, Qilong, et al.
Published: (2024)
Disentangled Representations for Causal Cognition
by: Torresan, Filippo, et al.
Published: (2024)
Disentangled Representation Learning for Causal Inference with Instruments
by: Cheng, Debo, et al.
Published: (2024)
Scaling with Collapse: Efficient and Predictable Training of LLM Families
by: Bergsma, Shane, et al.
Published: (2025)
Learning Causally Disentangled Representations via the Principle of Independent Causal Mechanisms
by: Komanduri, Aneesh, et al.
Published: (2023)
Intervening to Learn and Compose Causally Disentangled Representations
by: Markham, Alex, et al.
Published: (2025)
Causal Flow-based Variational Auto-Encoder for Disentangled Causal Representation Learning
by: Fan, Di, et al.
Published: (2023)
Linear Causal Representation Learning by Topological Ordering, Pruning, and Disentanglement
by: Chen, Hao, et al.
Published: (2025)
Disentangling Causal Substructures for Interpretable and Generalizable Drug Synergy Prediction
by: Luo, Yi, et al.
Published: (2025)
Causality-Driven Disentangled Representation Learning in Multiplex Graphs
by: Nasiri, Saba, et al.
Published: (2026)
CoRPO: Adding a Correctness Bias to GRPO Improves Generalization
by: Garg, Anisha, et al.
Published: (2025)
A Theoretical Analysis of Compositional Generalization in Neural Networks: A Necessary and Sufficient Condition
by: Li, Yuanpeng
Published: (2025)
Disentangling Dynamical Systems: Causal Representation Learning Meets Local Sparse Attention
by: Baumgartner, Markus W., et al.
Published: (2026)
Power Lines: Scaling Laws for Weight Decay and Batch Size in LLM Pre-training
by: Bergsma, Shane, et al.
Published: (2025)
Domain-Aware Continual Zero-Shot Learning
by: Yi, Kai, et al.
Published: (2021)
Fair Graph Representation Learning via Sensitive Attribute Disentanglement
by: Zhu, Yuchang, et al.
Published: (2024)
Temporal Positive-unlabeled Learning for Biomedical Hypothesis Generation via Risk Estimation
by: Akujuobi, Uchenna, et al.
Published: (2020)
Straight to Zero: Why Linearly Decaying the Learning Rate to Zero Works Best for LLMs
by: Bergsma, Shane, et al.
Published: (2025)
Bi-Factorial Preference Optimization: Balancing Safety-Helpfulness in Language Models
by: Zhang, Wenxuan, et al.
Published: (2024)
CompoSE: Compositional Synthesis and Editing of 3D Shapes via Part-Aware Control
by: Slim, Habib, et al.
Published: (2026)
Disentangled Double Machine Learning for Accurate Causal Effect Estimation
by: Xiang, Guodu, et al.
Published: (2026)
Variational Learning of Disentangled Representations
by: Slavutsky, Yuli, et al.
Published: (2025)
A Shared Valence Axis Across Modern LLMs and Human EEG: The Saturation Regularity
by: Radwan, Yousef A., et al.
Published: (2026)
Rethinking State Disentanglement in Causal Reinforcement Learning
by: Cao, Haiyao, et al.
Published: (2024)
Freshness-Aware Prioritized Experience Replay for LLM/VLM Reinforcement Learning
by: Ma, Weiyu, et al.
Published: (2026)
Disentangled Representation Learning
by: Wang, Xin, et al.
Published: (2022)
Query-based Knowledge Transfer for Heterogeneous Learning Environments
by: Alballa, Norah, et al.
Published: (2025)
Disentangled Representation Learning via Flow Matching
by: Chi, Jinjin, et al.
Published: (2026)
Federated Granger Causality Learning for Interdependent Clients with State Space Representation
by: Mohanty, Ayush, et al.
Published: (2025)
Don't be lazy: CompleteP enables compute-efficient deep transformers
by: Dey, Nolan, et al.
Published: (2025)
Continual Learning on a Diet: Learning from Sparsely Labeled Streams Under Constrained Computation
by: Zhang, Wenxuan, et al.
Published: (2024)
Inductive Subgraphs as Shortcuts: Causal Disentanglement for Heterophilic Graph Learning
by: Wang, Xiangmeng, et al.
Published: (2026)
Disentangled Generative Graph Representation Learning
by: Hu, Xinyue, et al.
Published: (2024)
Learning Time-Aware Causal Representation for Model Generalization in Evolving Domains
by: He, Zhuo, et al.
Published: (2025)
Disentangled Hyperbolic Representation Learning for Heterogeneous Graphs
by: Bai, Qijie, et al.
Published: (2024)