| Main Authors: | Ye, Haotian; Jain, Himanshu; You, Chong; Suresh, Ananda Theertha; Lin, Haowei; Zou, James; Yu, Felix |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2504.09135 |
Similar Items
SpecTr: Fast Speculative Decoding via Optimal Transport
by: Sun, Ziteng, et al.
Published: (2023)
Coupling without Communication and Drafter-Invariant Speculative Decoding
by: Daliri, Majid, et al.
Published: (2024)
Hierarchical Retrieval: The Geometry and a Pretrain-Finetune Recipe
by: You, Chong, et al.
Published: (2025)
Exploring and Improving Drafts in Blockwise Parallel Decoding
by: Kim, Taehyeon, et al.
Published: (2024)
Block Verification Accelerates Speculative Decoding
by: Sun, Ziteng, et al.
Published: (2024)
Generative Evaluation of Complex Reasoning in Large Language Models
by: Lin, Haowei, et al.
Published: (2025)
CoDistill-GRPO: A Co-Distillation Recipe for Efficient Group Relative Policy Optimization
by: Kwon, Soo Min, et al.
Published: (2026)
Selecting Large Language Model to Fine-tune via Rectified Scaling Law
by: Lin, Haowei, et al.
Published: (2024)
Asymptotics of Language Model Alignment
by: Yang, Joy Qiping, et al.
Published: (2024)
Efficient Language Model Architectures for Differentially Private Federated Learning
by: Ro, Jae Hun, et al.
Published: (2024)
Theoretical guarantees on the best-of-n alignment policy
by: Beirami, Ahmad, et al.
Published: (2024)
Can Language Models Discover Scaling Laws?
by: Lin, Haowei, et al.
Published: (2025)
Improved Unbiased Watermark for Large Language Models
by: Chen, Ruibo, et al.
Published: (2025)
Conceptual and Unbiased Reasoning in Language Models
by: Zhou, Ben, et al.
Published: (2024)
DSCD: Large Language Model Detoxification with Self-Constrained Decoding
by: Dong, Ming, et al.
Published: (2025)
BiMark: Unbiased Multilayer Watermarking for Large Language Models
by: Feng, Xiaoyan, et al.
Published: (2025)
Measuring Implicit Bias in Explicitly Unbiased Large Language Models
by: Bai, Xuechunzi, et al.
Published: (2024)
CSV-Decode: Certifiable Sub-Vocabulary Decoding for Efficient Large Language Model Inference
by: Liu, Dong, et al.
Published: (2025)
When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models
by: You, Haoran, et al.
Published: (2024)
FAEDKV: Infinite-Window Fourier Transform for Unbiased KV Cache Compression
by: Li, Runchao, et al.
Published: (2025)
Recurrent Drafter for Fast Speculative Decoding in Large Language Models
by: Cheng, Yunfei, et al.
Published: (2024)
Latent Distance Guided Alignment Training for Large Language Models
by: Luo, Haotian
Published: (2024)
Thinking Before Constraining: A Unified Decoding Framework for Large Language Models
by: Nguyen, Ngoc Trinh Hung, et al.
Published: (2026)
AdaSpec: Adaptive Speculative Decoding for Fast, SLO-Aware Large Language Model Serving
by: Huang, Kaiyu, et al.
Published: (2025)
Breaking Block Boundaries: Anchor-based History-stable Decoding for Diffusion Large Language Models
by: Zou, Shun, et al.
Published: (2026)
Sketch-Guided Constrained Decoding for Boosting Blackbox Large Language Models without Logit Access
by: Geng, Saibo, et al.
Published: (2024)
On Robust Hypothesis Testing with respect to the Hellinger Distance
by: Modak, Eeshan, et al.
Published: (2025)
Mean estimation in the add-remove model of differential privacy
by: Kulesza, Alex, et al.
Published: (2023)
Comparison of Scoring Rationales Between Large Language Models and Human Raters
by: Hua, Haowei, et al.
Published: (2025)
Unlocking Anticipatory Text Generation: A Constrained Approach for Large Language Models Decoding
by: Tu, Lifu, et al.
Published: (2023)
Flexible and Efficient Grammar-Constrained Decoding
by: Park, Kanghee, et al.
Published: (2025)
Watermarking Low-entropy Generation for Large Language Models: An Unbiased and Low-risk Method
by: Mao, Minjia, et al.
Published: (2024)
MemeArena: Automating Context-Aware Unbiased Evaluation of Harmfulness Understanding for Multimodal Large Language Models
by: Chen, Zixin, et al.
Published: (2025)
Sparse Reward Subsystem in Large Language Models
by: Xu, Guowei, et al.
Published: (2026)
Plato: Plan to Efficiently Decode for Large Language Model Inference
by: Jin, Shuowei, et al.
Published: (2024)
Adaptive Draft-Verification for Efficient Large Language Model Decoding
by: Liu, Xukun, et al.
Published: (2024)
SELP: Generating Safe and Efficient Task Plans for Robot Agents with Large Language Models
by: Wu, Yi, et al.
Published: (2024)
CreditDecoding: Accelerating Parallel Decoding in Diffusion Large Language Models with Trace Credit
by: Wang, Kangyu, et al.
Published: (2025)
InfAlign: Inference-aware language model alignment
by: Balashankar, Ananth, et al.
Published: (2024)
Efficient Beam Search for Large Language Models Using Trie-Based Decoding
by: Chan, Brian J, et al.
Published: (2025)