| Main Authors: | Cai, Min, Zhang, Yuchen, Zhang, Shichang, Yin, Fan, Zhang, Dan, Zou, Difan, Yue, Yisong, Hu, Ziniu |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2406.02721 |
Similar Items
Strategist: Self-improvement of LLM Decision Making via Bi-Level Tree Search
by: Light, Jonathan, et al.
Published: (2024)
DataSciBench: An LLM Agent Benchmark for Data Science
by: Zhang, Dan, et al.
Published: (2025)
TDRM: Smooth Reward Models with Temporal Difference for LLM RL and Inference
by: Zhang, Dan, et al.
Published: (2025)
Mask-GCG: Are All Tokens in Adversarial Suffixes Necessary for Jailbreak Attacks?
by: Mu, Junjie, et al.
Published: (2025)
ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search
by: Zhang, Dan, et al.
Published: (2024)
Retention analysis of edited knowledge after fine-tuning
by: Wen, Fufang, et al.
Published: (2025)
SAM Decoding: Speculative Decoding via Suffix Automaton
by: Hu, Yuxuan, et al.
Published: (2024)
Breaking Contextual Inertia: Reinforcement Learning with Single-Turn Anchors for Stable Multi-Turn Interaction
by: Chen, Xingwu, et al.
Published: (2026)
SceneCraft: An LLM Agent for Synthesizing 3D Scene as Blender Code
by: Hu, Ziniu, et al.
Published: (2024)
Editing Knowledge Representation of Language Model via Rephrased Prefix Prompts
by: Cai, Yuchen, et al.
Published: (2024)
Turning Internal Gap into Self-Improvement: Promoting the Generation-Understanding Unification in MLLMs
by: Han, Yujin, et al.
Published: (2025)
PrefixMemory-Tuning: Modernizing Prefix-Tuning by Decoupling the Prefix from Attention
by: Wang, Haonan, et al.
Published: (2025)
Interpreting and Controlling LLM Reasoning through Integrated Policy Gradient
by: Li, Changming, et al.
Published: (2026)
Linear-Time Demonstration Selection for In-Context Learning via Gradient Estimation
by: Zhang, Ziniu, et al.
Published: (2025)
SciBench: Evaluating College-Level Scientific Problem-Solving Abilities of Large Language Models
by: Wang, Xiaoxuan, et al.
Published: (2023)
Well Begun, Half Done: Reinforcement Learning with Prefix Optimization for LLM Reasoning
by: Sun, Yiliu, et al.
Published: (2025)
A Human-Like Reasoning Framework for Multi-Phases Planning Task with Large Language Models
by: Xie, Chengxing, et al.
Published: (2024)
Differentiable Evolutionary Reinforcement Learning
by: Cheng, Sitao, et al.
Published: (2025)
WASD: Locating Critical Neurons as Sufficient Conditions for Explaining and Controlling LLM Behavior
by: Yu, Haonan, et al.
Published: (2026)
Reveal and Release: Iterative LLM Unlearning with Self-generated Data
by: Xie, Linxi, et al.
Published: (2025)
Prefix Text as a Yarn: Eliciting Non-English Alignment in Foundation Language Model
by: Zhan, Runzhe, et al.
Published: (2024)
From Prefix Cache to Fusion RAG Cache: Accelerating LLM Inference in Retrieval-Augmented Generation
by: Wang, Jiahao, et al.
Published: (2026)
Beyond Numeric Rewards: In-Context Dueling Bandits with LLM Agents
by: Xia, Fanzeng, et al.
Published: (2024)
Does higher interpretability imply better utility? A Pairwise Analysis on Sparse Autoencoders
by: Wang, Xu, et al.
Published: (2025)
How Post-Training Reshapes LLMs: A Mechanistic View on Knowledge, Truthfulness, Refusal, and Confidence
by: Du, Hongzhe, et al.
Published: (2025)
Path-Consistency with Prefix Enhancement for Efficient Inference in LLMs
by: Zhu, Jiace, et al.
Published: (2024)
Controlled Self-Evolution for Algorithmic Code Optimization
by: Hu, Tu, et al.
Published: (2026)
Detecting Prefix Bias in LLM-based Reward Models
by: Kumar, Ashwin, et al.
Published: (2025)
CtrlCoT: Dual-Granularity Chain-of-Thought Compression for Controllable Reasoning
by: Fan, Zhenxuan, et al.
Published: (2026)
On the Robustness of Transformers against Context Hijacking for Linear Classification
by: Li, Tianle, et al.
Published: (2025)
Tiny Refinements Elicit Resilience: Toward Efficient Prefix-Model Against LLM Red-Teaming
by: Liu, Jiaxu, et al.
Published: (2024)
Dynamic Long Context Reasoning over Compressed Memory via End-to-End Reinforcement Learning
by: Chen, Zhuoen, et al.
Published: (2026)
The Path of Least Resistance: Guiding LLM Reasoning Trajectories with Prefix Consensus
by: Jindal, Ishan, et al.
Published: (2026)
CMT: A Memory Compression Method for Continual Knowledge Learning of Large Language Models
by: Li, Dongfang, et al.
Published: (2024)
MIRAI: Evaluating LLM Agents for Event Forecasting
by: Ye, Chenchen, et al.
Published: (2024)
Event Causality Identification with Synthetic Control
by: Wang, Haoyu, et al.
Published: (2025)
ERC-SVD: Error-Controlled SVD for Large Language Model Compression
by: Bai, Haolei, et al.
Published: (2025)
Semantic Voting: A Self-Evaluation-Free Approach for Efficient LLM Self-Improvement on Unverifiable Open-ended Tasks
by: Jiang, Chunyang, et al.
Published: (2025)
Understanding the Generalization of Stochastic Gradient Adam in Learning Neural Networks
by: Tang, Xuan, et al.
Published: (2025)
AdvPrefix: An Objective for Nuanced LLM Jailbreaks
by: Zhu, Sicheng, et al.
Published: (2024)