| Main Authors: | Tan, Zhen, Dong, Daize, Zhao, Xinyu, Peng, Jie, Cheng, Yu, Chen, Tianlong |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2407.11030 |
Similar Items
Probing to Refine: Reinforcement Distillation of LLMs via Explanatory Inversion
by: Tan, Zhen, et al.
Published: (2026)
Can GRPO Help LLMs Transcend Their Pretraining Origin?
by: Ni, Kangqi, et al.
Published: (2025)
QuantMoE-Bench: Examining Post-Training Quantization for Mixture-of-Experts
by: Li, Pingzhi, et al.
Published: (2024)
Leave It to the Experts: Detecting Knowledge Distillation via MoE Expert Signatures
by: Li, Pingzhi, et al.
Published: (2025)
DOGe: Defensive Output Generation for LLM Protection Against Knowledge Distillation
by: Li, Pingzhi, et al.
Published: (2025)
Towards Efficient Mixture of Experts: A Holistic Study of Compression Techniques
by: He, Shwai, et al.
Published: (2024)
DRESSing Up LLM: Efficient Stylized Question-Answering via Style Subspace Editing
by: Ma, Xinyu, et al.
Published: (2025)
Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy
by: Li, Pingzhi, et al.
Published: (2023)
Model-GLUE: Democratized LLM Scaling for A Large Model Zoo in the Wild
by: Zhao, Xinyu, et al.
Published: (2024)
Dr.LLM: Dynamic Layer Routing in LLMs
by: Heakl, Ahmed, et al.
Published: (2025)
GraphRCG: Self-Conditioned Graph Generation
by: Wang, Song, et al.
Published: (2024)
$\textit{Agents Under Siege}$: Breaking Pragmatic Multi-Agent LLM Systems with Optimized Prompt Attacks
by: Khan, Rana Muhammad Shahroz, et al.
Published: (2025)
UProp: Investigating the Uncertainty Propagation of LLMs in Multi-Step Agentic Decision-Making
by: Duan, Jinhao, et al.
Published: (2025)
From Pruning to Grafting: Dynamic Knowledge Redistribution via Learnable Layer Fusion
by: Pei, Zehua, et al.
Published: (2024)
LESA: Learnable LLM Layer Scaling-Up
by: Yang, Yifei, et al.
Published: (2025)
Not All Layers of LLMs Are Necessary During Inference
by: Fan, Siqi, et al.
Published: (2024)
Finding the Cracks: Improving LLMs Reasoning with Paraphrastic Probing and Consistency Verification
by: Shi, Weili, et al.
Published: (2026)
Tuning-Free Accountable Intervention for LLM Deployment -- A Metacognitive Approach
by: Tan, Zhen, et al.
Published: (2024)
Measuring Real-World Prompt Injection Attacks in LLM-based Resume Screening
by: Zhang, Mohan, et al.
Published: (2026)
DynScaling: Efficient Verifier-free Inference Scaling via Dynamic and Integrated Sampling
by: Wang, Fei, et al.
Published: (2025)
DOTS: Learning to Reason Dynamically in LLMs via Optimal Reasoning Trajectories Search
by: Yue, Murong, et al.
Published: (2024)
Advancing MoE Efficiency: A Collaboration-Constrained Routing (C2R) Strategy for Better Expert Parallelism Design
by: Zhang, Mohan, et al.
Published: (2025)
How to Train Data-Efficient LLMs
by: Sachdeva, Noveen, et al.
Published: (2024)
Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens
by: Zhao, Chengshuai, et al.
Published: (2025)
GTBench: Uncovering the Strategic Reasoning Limitations of LLMs via Game-Theoretic Evaluations
by: Duan, Jinhao, et al.
Published: (2024)
R-Stitch: Dynamic Trajectory Stitching for Efficient Reasoning
by: Chen, Zhuokun, et al.
Published: (2025)
KVSharer: Efficient Inference via Layer-Wise Dissimilar KV Cache Sharing
by: Yang, Yifei, et al.
Published: (2024)
PoTPTQ: A Two-step Power-of-Two Post-training for LLMs
by: Wang, Xinyu, et al.
Published: (2025)
Steer2Adapt: Dynamically Composing Steering Vectors Elicits Efficient Adaptation of LLMs
by: Han, Pengrui, et al.
Published: (2026)
Towards a Comprehensive Scaling Law of Mixture-of-Experts
by: Zhao, Guoliang, et al.
Published: (2025)
AdaSPEC: Selective Knowledge Distillation for Efficient Speculative Decoders
by: Hu, Yuezhou, et al.
Published: (2025)
The Wolf Within: Covert Injection of Malice into MLLM Societies via an MLLM Operative
by: Tan, Zhen, et al.
Published: (2024)
PiCO: Peer Review in LLMs based on the Consistency Optimization
by: Ning, Kun-Peng, et al.
Published: (2024)
Sample-Efficient Alignment for LLMs
by: Liu, Zichen, et al.
Published: (2024)
Scaling Stick-Breaking Attention: An Efficient Implementation and In-depth Study
by: Tan, Shawn, et al.
Published: (2024)
OPT-Engine: Benchmarking the Limits of LLMs in Optimization Modeling via Complexity Scaling
by: Chen, Yitian, et al.
Published: (2026)
A Graph is Worth $K$ Words: Euclideanizing Graph using Pure Transformer
by: Gao, Zhangyang, et al.
Published: (2024)
AutoScale: Scale-Aware Data Mixing for Pre-Training LLMs
by: Kang, Feiyang, et al.
Published: (2024)
On Giant's Shoulders: Effortless Weak to Strong by Dynamic Logits Fusion
by: Fan, Chenghao, et al.
Published: (2024)
Beyond Redundancy: Diverse and Specialized Multi-Expert Sparse Autoencoder
by: Xu, Zhen, et al.
Published: (2025)