| Main Authors: | Ruiz, Tomas, Qin, Zhen, Zhang, Yifan, Shen, Xuyang, Zhong, Yiran, Wang, Mengdi |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2603.15854 |
Similar Items
Higher-order Linear Attention
by: Zhang, Yifan, et al.
Published: (2025)
MolMem: Memory-Augmented Agentic Reinforcement Learning for Sample-Efficient Molecular Optimization
by: Wang, Ziqing, et al.
Published: (2026)
TreeBoN: Enhancing Inference-Time Alignment with Speculative Tree-Search and Best-of-N Sampling
by: Qiu, Jiahao, et al.
Published: (2024)
AdaZeta: Adaptive Zeroth-Order Tensor-Train Adaption for Memory-Efficient Large Language Models Fine-Tuning
by: Yang, Yifan, et al.
Published: (2024)
Accelerating Diffusion Large Language Models with SlowFast Sampling: The Three Golden Principles
by: Wei, Qingyan, et al.
Published: (2025)
OpenClaw-RL: Train Any Agent Simply by Talking
by: Wang, Yinjie, et al.
Published: (2026)
Sample-Efficient Alignment for LLMs
by: Liu, Zichen, et al.
Published: (2024)
LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning
by: Pan, Rui, et al.
Published: (2024)
MemRerank: Preference Memory for Personalized Product Reranking
by: Peng, Zhiyuan, et al.
Published: (2026)
Deep Delta Learning
by: Zhang, Yifan, et al.
Published: (2026)
Active Preference Optimization for Sample Efficient RLHF
by: Das, Nirjhar, et al.
Published: (2024)
Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models
by: Qin, Zhen, et al.
Published: (2024)
Optimizing Chain-of-Thought Reasoners via Gradient Variance Minimization in Rejection Sampling and RL
by: Yao, Jiarui, et al.
Published: (2025)
Efficiently Dispatching Flash Attention For Partially Filled Attention Masks
by: Sharma, Agniv, et al.
Published: (2024)
A Common Pitfall of Margin-based Language Model Alignment: Gradient Entanglement
by: Yuan, Hui, et al.
Published: (2024)
Replay Failures as Successes: Sample-Efficient Reinforcement Learning for Instruction Following
by: Zhang, Kongcheng, et al.
Published: (2025)
Fast Controlled Generation from Language Models with Adaptive Weighted Rejection Sampling
by: Lipkin, Benjamin, et al.
Published: (2025)
Hallucination Detection in LLMs: Fast and Memory-Efficient Fine-Tuned Models
by: Arteaga, Gabriel Y., et al.
Published: (2024)
Beyond Speedup -- Utilizing KV Cache for Sampling and Reasoning
by: Xing, Zeyu, et al.
Published: (2026)
PRISM: Parametrically Refactoring Inference for Speculative Sampling Draft Models
by: Wang, Xuliang, et al.
Published: (2026)
Iterative Deepening Sampling as Efficient Test-Time Scaling
by: Chen, Weizhe, et al.
Published: (2025)
STEM: Efficient Relative Capability Evaluation of LLMs through Structured Transition Samples
by: Hu, Haiquan, et al.
Published: (2025)
Linear Attention Sequence Parallelism
by: Sun, Weigao, et al.
Published: (2024)
A Semantic-Sampling Framework for Evaluating Calibration in Open-Ended Question Answering
by: Wang, Zhanliang, et al.
Published: (2026)
DynScaling: Efficient Verifier-free Inference Scaling via Dynamic and Integrated Sampling
by: Wang, Fei, et al.
Published: (2025)
On the Overscaling Curse of Parallel Thinking: System Efficacy Contradicts Sample Efficiency
by: Wang, Yiming, et al.
Published: (2026)
FlashSpeech: Efficient Zero-Shot Speech Synthesis
by: Ye, Zhen, et al.
Published: (2024)
Sample-aware Adaptive Structured Pruning for Large Language Models
by: Kong, Jun, et al.
Published: (2025)
SpecDec++: Boosting Speculative Decoding via Adaptive Candidate Lengths
by: Huang, Kaixuan, et al.
Published: (2024)
SLOT: Sample-specific Language Model Optimization at Test-time
by: Hu, Yang, et al.
Published: (2025)
LAMPO: Large Language Models as Preference Machines for Few-shot Ordinal Classification
by: Qin, Zhen, et al.
Published: (2024)
Exploratory Preference Optimization: Harnessing Implicit Q*-Approximation for Sample-Efficient RLHF
by: Xie, Tengyang, et al.
Published: (2024)
Sample-Efficient Online Learning in LM Agents via Hindsight Trajectory Rewriting
by: Hu, Michael Y., et al.
Published: (2025)
Federated Data-Efficient Instruction Tuning for Large Language Models
by: Qin, Zhen, et al.
Published: (2024)
Interactive Benchmarks
by: Yue, Baoqing, et al.
Published: (2026)
GeRe: Towards Efficient Anti-Forgetting in Continual Learning of LLM via General Samples Replay
by: Zhang, Yunan, et al.
Published: (2025)
OccamLLM: Fast and Exact Language Model Arithmetic in a Single Step
by: Dugan, Owen, et al.
Published: (2024)
Mining Intrinsic Rewards from LLM Hidden States for Efficient Best-of-N Sampling
by: Guo, Jizhou, et al.
Published: (2025)
To be Continuous, or to be Discrete, Those are Bits of Questions
by: Wang, Yiran, et al.
Published: (2024)
On Eliciting Syntax from Language Models via Hashing
by: Wang, Yiran, et al.
Published: (2024)