| Main Authors: | Sun, Ziteng, Benton, Adrian, Kushnir, Samuel, Trockman, Asher, Singh, Vikas, Diggavi, Suhas, Suresh, Ananda Theertha |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2511.19705 |
Similar Items
Multi-Mixer Models: Flexible Sequence Modeling with Shared Representations
by: Li, Kevin Y., et al.
Published: (2026)
The importance of feature preprocessing for differentially private linear optimization
by: Sun, Ziteng, et al.
Published: (2023)
Subset-Based Instance Optimality in Private Estimation
by: Dick, Travis, et al.
Published: (2023)
Private federated discovery of out-of-vocabulary words for Gboard
by: Sun, Ziteng, et al.
Published: (2024)
Exploring and Improving Drafts in Blockwise Parallel Decoding
by: Kim, Taehyeon, et al.
Published: (2024)
Asymptotics of Language Model Alignment
by: Yang, Joy Qiping, et al.
Published: (2024)
CoDistill-GRPO: A Co-Distillation Recipe for Efficient Group Relative Policy Optimization
by: Kwon, Soo Min, et al.
Published: (2026)
SpecTr: Fast Speculative Decoding via Optimal Transport
by: Sun, Ziteng, et al.
Published: (2023)
In-Context Credit Assignment via the Core
by: Harris, Keegan, et al.
Published: (2026)
Coupling without Communication and Drafter-Invariant Speculative Decoding
by: Daliri, Majid, et al.
Published: (2024)
On Robust Hypothesis Testing with respect to the Hellinger Distance
by: Modak, Eeshan, et al.
Published: (2025)
Mean estimation in the add-remove model of differential privacy
by: Kulesza, Alex, et al.
Published: (2023)
Mimetic Initialization of MLPs
by: Trockman, Asher, et al.
Published: (2026)
ICQuant: Index Coding enables Low-bit LLM Quantization
by: Li, Xinlin, et al.
Published: (2025)
An Information-Theoretic Approach to Understanding Transformers' In-Context Learning of Variable-Order Markov Chains
by: Zhou, Ruida, et al.
Published: (2024)
Block Verification Accelerates Speculative Decoding
by: Sun, Ziteng, et al.
Published: (2024)
Rate of Model Collapse in Recursive Training
by: Suresh, Ananda Theertha, et al.
Published: (2024)
SPIRE: Conditional Personalization for Federated Diffusion Generative Models
by: Ozkara, Kaan, et al.
Published: (2025)
Multi-Group Fairness Evaluation via Conditional Value-at-Risk Testing
by: Paes, Lucas Monteiro, et al.
Published: (2023)
Reframing Data Value for Large Language Models Through the Lens of Plausibility
by: Rammal, Mohamad Rida, et al.
Published: (2024)
InfAlign: Inference-aware language model alignment
by: Balashankar, Ananth, et al.
Published: (2024)
Efficient Language Model Architectures for Differentially Private Federated Learning
by: Ro, Jae Hun, et al.
Published: (2024)
Robust Federated Personalised Mean Estimation for the Gaussian Mixture Model
by: Managoli, Malhar A., et al.
Published: (2025)
Common Information Dimension
by: Hanna, Osama, et al.
Published: (2023)
ADEPT: Hierarchical Bayes Approach to Personalized Federated Unsupervised Learning
by: Ozkara, Kaan, et al.
Published: (2024)
Hierarchical Retrieval: The Geometry and a Pretrain-Finetune Recipe
by: You, Chong, et al.
Published: (2025)
Effect of Pre‐ and Post‐Milling Processing Techniques on the Physico‐Chemical, Functional, and Pasting Properties of Sorghum
by: Theertha DP, et al.
Published: (2025)
On the optimal regret of collaborative personalized linear bandits
by: Huang, Bruce, et al.
Published: (2025)
Normalized Narrow Jump To Conclusions: Normalized Narrow Shortcuts for Parameter Efficient Early Exit Transformer Prediction
by: Seshadri, Amrit Diggavi
Published: (2024)
Mimetic Initialization Helps State Space Models Learn to Recall
by: Trockman, Asher, et al.
Published: (2024)
One Jump Is All You Need: Short-Cutting Transformers for Early Exit Prediction with One Jump to Fit All Exit Levels
by: Seshadri, Amrit Diggavi
Published: (2025)
Efficient and Asymptotically Unbiased Constrained Decoding for Large Language Models
by: Ye, Haotian, et al.
Published: (2025)
Revisiting Adaptive Rounding with Vectorized Reparameterization for LLM Quantization
by: Zhou, Yuli, et al.
Published: (2026)
Theoretical guarantees on the best-of-n alignment policy
by: Beirami, Ahmad, et al.
Published: (2024)
FrameQuant: Flexible Low-Bit Quantization for Transformers
by: Adepu, Harshavardhan, et al.
Published: (2024)
CoreQ: Learning-Free Mismatch Correction and Successive Rounding for Quantization
by: Cha, Seohyeon, et al.
Published: (2026)
Failure of the local-global principle for isotropy of quadratic forms over function fields
by: Auel, Asher, et al.
Published: (2017)
AT-2FF: Adaptive Type-2 Fuzzy Filter for De-noising Images Corrupted with Salt-and-Pepper
by: Singh, Vikas
Published: (2023)
Simultaneous photonic and phononic bandgaps in a hexagonal lattice geometry with gradually transforming circular-to-triangular air gap holes
by: Bharadwaj, Suhas Suresh, et al.
Published: (2025)
Chiplet-Based RISC-V SoC with Modular AI Acceleration
by: Bharadwaj, Suhas Suresh, et al.
Published: (2025)