| Main Authors: | Pai, Druv; Wu, Ziyang; Buchanan, Sam; Yu, Yaodong; Ma, Yi |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2404.02446 |
Similar Items
White-Box Transformers via Sparse Rate Reduction: Compression Is All There Is?
by: Yu, Yaodong, et al.
Published: (2023)
On the Edge of Memorization in Diffusion Models
by: Buchanan, Sam, et al.
Published: (2025)
Attention-Only Transformers via Unrolled Subspace Denoising
by: Wang, Peng, et al.
Published: (2025)
Token Statistics Transformer: Linear-Time Attention via Variational Rate Reduction
by: Wu, Ziyang, et al.
Published: (2024)
Scaling White-Box Transformers for Vision
by: Yang, Jinrui, et al.
Published: (2024)
A Global Geometric Analysis of Maximal Coding Rate Reduction
by: Wang, Peng, et al.
Published: (2024)
Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs
by: Guo, Tianyu, et al.
Published: (2024)
Diffusion-BBO: Diffusion-Based Inverse Modeling for Online Black-Box Optimization
by: Wu, Dongxia, et al.
Published: (2024)
White-Box Diffusion Transformer for single-cell RNA-seq generation
by: Cui, Zhuorui, et al.
Published: (2024)
Independent and Decentralized Learning in Markov Potential Games
by: Maheshwari, Chinmay, et al.
Published: (2022)
Canonical Factors for Hybrid Neural Fields
by: Yi, Brent, et al.
Published: (2023)
Beyond Scores: Proximal Diffusion Models
by: Fang, Zhenghan, et al.
Published: (2025)
Unlocking Interpretability for RF Sensing: A Complex-Valued White-Box Transformer
by: Zhang, Xie, et al.
Published: (2025)
Where to Mask: Structure-Guided Masking for Graph Masked Autoencoders
by: Liu, Chuang, et al.
Published: (2024)
Beyond Masked and Unmasked: Discrete Diffusion Models via Partial Masking
by: Chao, Chen-Hao, et al.
Published: (2025)
Learning Generation Orders for Masked Discrete Diffusion Models via Variational Inference
by: Fox, David, et al.
Published: (2026)
Learning Expressive Random Feature Models via Parametrized Activations
by: Ma, Zailin, et al.
Published: (2024)
Masked BRep Autoencoder via Hierarchical Graph Transformer
by: Li, Yifei, et al.
Published: (2026)
Robust Guided Diffusion for Offline Black-Box Optimization
by: Chen, Can Sam, et al.
Published: (2024)
Benchmarking Instance-Centric Counterfactual Algorithms for XAI: From White Box to Black Box
by: Moreira, Catarina, et al.
Published: (2022)
Are Linear Regression Models White Box and Interpretable?
by: Salih, Ahmed M, et al.
Published: (2024)
A Diffusion Model for Regular Time Series Generation from Irregular Data with Completion and Masking
by: Fadlon, Gal, et al.
Published: (2025)
Mask Is What DLLM Needs: A Masked Data Training Paradigm for Diffusion LLMs
by: Ma, Linrui, et al.
Published: (2026)
Differentially Private Representation Learning via Image Captioning
by: Sander, Tom, et al.
Published: (2024)
On the Generalization Properties of Learning the Random Feature Models with Learnable Activation Functions
by: Ma, Zailin, et al.
Published: (2025)
Turning Black Box into White Box: Dataset Distillation Leaks
by: Chen, Huajie, et al.
Published: (2026)
What's in a Prior? Learned Proximal Networks for Inverse Problems
by: Fang, Zhenghan, et al.
Published: (2023)
Simplifying DINO via Coding Rate Regularization
by: Wu, Ziyang, et al.
Published: (2025)
On the Trainability of Masked Diffusion Language Models via Blockwise Locality
by: Wang, Yuxiang, et al.
Published: (2026)
On the Role of Attention Masks and LayerNorm in Transformers
by: Wu, Xinyi, et al.
Published: (2024)
Operational Feature Fingerprints of Graph Datasets via a White-Box Signal-Subspace Probe
by: Xiong, Yuchen, et al.
Published: (2026)
Precipitation Nowcasting Using Diffusion Transformer with Causal Attention
by: Li, ChaoRong, et al.
Published: (2024)
Integrating White and Black Box Techniques for Interpretable Machine Learning
by: Vernon, Eric M., et al.
Published: (2024)
Fast Training of Diffusion Models with Masked Transformers
by: Zheng, Hongkai, et al.
Published: (2023)
Collab-RAG: Boosting Retrieval-Augmented Generation for Complex Question Answering via White-Box and Black-Box LLM Collaboration
by: Xu, Ran, et al.
Published: (2025)
Towards White Box Deep Learning
by: Satkiewicz, Maciej
Published: (2024)
Curriculum Learning for Efficient Chain-of-Thought Distillation via Structure-Aware Masking and GRPO
by: Yu, Bowen, et al.
Published: (2026)
DUEL: Exact Likelihood for Masked Diffusion via Deterministic Unmasking
by: Turok, Gilad, et al.
Published: (2026)
From Black-Box to White-Box: Control-Theoretic Neural Network Interpretability
by: Moon, Jihoon
Published: (2025)
EdgeMask-DG*: Learning Domain-Invariant Graph Structures via Adversarial Edge Masking
by: Bhattacharya, Rishabh, et al.
Published: (2026)