| Main Authors: | Song, Woomin, Dingliwal, Saket, Jayanthi, Sai Muralidhar, Ganesh, Bhavana, Shin, Jinwoo, Galstyan, Aram, Bodapati, Sravan Babu |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2506.04708 |
Similar Items
Think Clearly: Improving Reasoning via Redundant Token Pruning
by: Choi, Daewon, et al.
Published: (2025)
IdleSpec: Exploiting Idle Time via Speculative Planning for LLM Agents
by: Choi, Daewon, et al.
Published: (2026)
ExComm: Exploration-Stage Communication for Error-Resilient Agentic Test-Time Scaling
by: Song, Woomin, et al.
Published: (2026)
Compress, Gather, and Recompute: REFORMing Long-Context Processing in Transformers
by: Song, Woomin, et al.
Published: (2025)
Mamba Drafters for Speculative Decoding
by: Choi, Daewon, et al.
Published: (2025)
SeRA: Self-Reviewing and Alignment of Large Language Models using Implicit Reward Margins
by: Ko, Jongwoo, et al.
Published: (2024)
Adaptive Video Understanding Agent: Enhancing efficiency with dynamic frame sampling and feedback-driven reasoning
by: Jeoung, Sullam, et al.
Published: (2024)
SpeechVerse: A Large-scale Generalizable Audio Language Model
by: Das, Nilaksh, et al.
Published: (2024)
Wanda++: Pruning Large Language Models via Regional Gradients
by: Yang, Yifan, et al.
Published: (2025)
Context Length Alone Hurts LLM Performance Despite Perfect Retrieval
by: Du, Yufeng, et al.
Published: (2025)
SpeechGuard: Exploring the Adversarial Robustness of Multimodal Large Language Models
by: Peri, Raghuveer, et al.
Published: (2024)
Sequential Editing for Lifelong Training of Speech Recognition Models
by: Kulshreshtha, Devang, et al.
Published: (2024)
Salient Information Prompting to Steer Content in Prompt-based Abstractive Summarization
by: Xu, Lei, et al.
Published: (2024)
ConSiDERS-The-Human Evaluation Framework: Rethinking Human Evaluation for Generative Large Language Models
by: Elangovan, Aparna, et al.
Published: (2024)
Document Haystack: A Long Context Multimodal Image/Document Understanding Vision LLM Benchmark
by: Huybrechts, Goeric, et al.
Published: (2025)
Tabular Transfer Learning via Prompting LLMs
by: Nam, Jaehyun, et al.
Published: (2024)
Sparsified State-Space Models are Efficient Highway Networks
by: Song, Woomin, et al.
Published: (2025)
Test-Time Speculation
by: Kumar, Avinash, et al.
Published: (2026)
KG-LLM-Bench: A Scalable Benchmark for Evaluating LLM Reasoning on Textualized Knowledge Graphs
by: Markowitz, Elan, et al.
Published: (2025)
Optimized Speculative Sampling for GPU Hardware Accelerators
by: Wagner, Dominik, et al.
Published: (2024)
Reward-Shifted Speculative Sampling Is An Efficient Test-Time Weak-to-Strong Aligner
by: Li, Bolian, et al.
Published: (2025)
LAWCAT: Efficient Distillation from Quadratic to Linear Attention with Convolution across Tokens for Long Context Modeling
by: Liu, Zeyu, et al.
Published: (2025)
ReVISE: Learning to Refine at Test-Time via Intrinsic Self-Verification
by: Lee, Hyunseok, et al.
Published: (2025)
RoboAlign: Learning Test-Time Reasoning for Language-Action Alignment in Vision-Language-Action Models
by: Kim, Dongyoung, et al.
Published: (2026)
Data Advisor: Dynamic Data Curation for Safety Alignment of Large Language Models
by: Wang, Fei, et al.
Published: (2024)
$\texttt{SPECS}$: Faster Test-Time Scaling through Speculative Drafts
by: Cemri, Mert, et al.
Published: (2025)
Speculate Deep and Accurate: Lossless and Training-Free Acceleration for Offloaded LLMs via Substitute Speculative Decoding
by: Wang, Pei-Shuo, et al.
Published: (2025)
Dynamic Speculation Lookahead Accelerates Speculative Decoding of Large Language Models
by: Mamou, Jonathan, et al.
Published: (2024)
Training Text-to-Molecule Models with Context-Aware Tokenization
by: Kim, Seojin, et al.
Published: (2025)
Prompt Perturbation Consistency Learning for Robust Language Models
by: Qiang, Yao, et al.
Published: (2024)
Scaling Up, Speeding Up: A Benchmark of Speculative Decoding for Efficient LLM Test-Time Scaling
by: Sun, Shengyin, et al.
Published: (2025)
Synthetic Multimodal Question Generation
by: Wu, Ian, et al.
Published: (2024)
Personalized Language Models via Privacy-Preserving Evolutionary Model Merging
by: Kim, Kyuyoung, et al.
Published: (2025)
Recursive Chain-of-Feedback Prevents Performance Degradation from Redundant Prompting
by: Ahn, Jinwoo, et al.
Published: (2024)
Efficient Adaptive Rejection Sampling for Accelerating Speculative Decoding in Large Language Models
by: Sun, Chendong, et al.
Published: (2025)
Self-Refining Language Model Anonymizers via Adversarial Distillation
by: Kim, Kyuyoung, et al.
Published: (2025)
ReMoDetect: Reward Models Recognize Aligned LLM's Generations
by: Lee, Hyunseok, et al.
Published: (2024)
Scaling Laws for Speculative Decoding
by: Yan, Siyuan, et al.
Published: (2025)
FR-Spec: Accelerating Large-Vocabulary Language Models via Frequency-Ranked Speculative Sampling
by: Zhao, Weilin, et al.
Published: (2025)
Attribute Controlled Fine-tuning for Large Language Models: A Case Study on Detoxification
by: Meng, Tao, et al.
Published: (2024)