:: Library Catalog

Bibliographic Details
Main Authors:	Tan, Zhen; Dong, Daize; Zhao, Xinyu; Peng, Jie; Cheng, Yu; Chen, Tianlong
Format:	Preprint
Published:	2024
Subjects:	Machine Learning; Artificial Intelligence; Computation and Language
Online Access:	https://arxiv.org/abs/2407.11030

Similar Items

Probing to Refine: Reinforcement Distillation of LLMs via Explanatory Inversion
by: Tan, Zhen, et al.
Published: (2026)

Can GRPO Help LLMs Transcend Their Pretraining Origin?
by: Ni, Kangqi, et al.
Published: (2025)

QuantMoE-Bench: Examining Post-Training Quantization for Mixture-of-Experts
by: Li, Pingzhi, et al.
Published: (2024)

Leave It to the Experts: Detecting Knowledge Distillation via MoE Expert Signatures
by: Li, Pingzhi, et al.
Published: (2025)

DOGe: Defensive Output Generation for LLM Protection Against Knowledge Distillation
by: Li, Pingzhi, et al.
Published: (2025)

Towards Efficient Mixture of Experts: A Holistic Study of Compression Techniques
by: He, Shwai, et al.
Published: (2024)

DRESSing Up LLM: Efficient Stylized Question-Answering via Style Subspace Editing
by: Ma, Xinyu, et al.
Published: (2025)

Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy
by: Li, Pingzhi, et al.
Published: (2023)

Model-GLUE: Democratized LLM Scaling for A Large Model Zoo in the Wild
by: Zhao, Xinyu, et al.
Published: (2024)

Dr.LLM: Dynamic Layer Routing in LLMs
by: Heakl, Ahmed, et al.
Published: (2025)

GraphRCG: Self-Conditioned Graph Generation
by: Wang, Song, et al.
Published: (2024)

Agents Under Siege: Breaking Pragmatic Multi-Agent LLM Systems with Optimized Prompt Attacks
by: Khan, Rana Muhammad Shahroz, et al.
Published: (2025)

UProp: Investigating the Uncertainty Propagation of LLMs in Multi-Step Agentic Decision-Making
by: Duan, Jinhao, et al.
Published: (2025)

From Pruning to Grafting: Dynamic Knowledge Redistribution via Learnable Layer Fusion
by: Pei, Zehua, et al.
Published: (2024)

LESA: Learnable LLM Layer Scaling-Up
by: Yang, Yifei, et al.
Published: (2025)

Not All Layers of LLMs Are Necessary During Inference
by: Fan, Siqi, et al.
Published: (2024)

Finding the Cracks: Improving LLMs Reasoning with Paraphrastic Probing and Consistency Verification
by: Shi, Weili, et al.
Published: (2026)

Tuning-Free Accountable Intervention for LLM Deployment -- A Metacognitive Approach
by: Tan, Zhen, et al.
Published: (2024)

Measuring Real-World Prompt Injection Attacks in LLM-based Resume Screening
by: Zhang, Mohan, et al.
Published: (2026)

DynScaling: Efficient Verifier-free Inference Scaling via Dynamic and Integrated Sampling
by: Wang, Fei, et al.
Published: (2025)

DOTS: Learning to Reason Dynamically in LLMs via Optimal Reasoning Trajectories Search
by: Yue, Murong, et al.
Published: (2024)

Advancing MoE Efficiency: A Collaboration-Constrained Routing (C2R) Strategy for Better Expert Parallelism Design
by: Zhang, Mohan, et al.
Published: (2025)

How to Train Data-Efficient LLMs
by: Sachdeva, Noveen, et al.
Published: (2024)

Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens
by: Zhao, Chengshuai, et al.
Published: (2025)

GTBench: Uncovering the Strategic Reasoning Limitations of LLMs via Game-Theoretic Evaluations
by: Duan, Jinhao, et al.
Published: (2024)

R-Stitch: Dynamic Trajectory Stitching for Efficient Reasoning
by: Chen, Zhuokun, et al.
Published: (2025)

KVSharer: Efficient Inference via Layer-Wise Dissimilar KV Cache Sharing
by: Yang, Yifei, et al.
Published: (2024)

PoTPTQ: A Two-step Power-of-Two Post-training for LLMs
by: Wang, Xinyu, et al.
Published: (2025)

Steer2Adapt: Dynamically Composing Steering Vectors Elicits Efficient Adaptation of LLMs
by: Han, Pengrui, et al.
Published: (2026)

Towards a Comprehensive Scaling Law of Mixture-of-Experts
by: Zhao, Guoliang, et al.
Published: (2025)

AdaSPEC: Selective Knowledge Distillation for Efficient Speculative Decoders
by: Hu, Yuezhou, et al.
Published: (2025)

The Wolf Within: Covert Injection of Malice into MLLM Societies via an MLLM Operative
by: Tan, Zhen, et al.
Published: (2024)

PiCO: Peer Review in LLMs based on the Consistency Optimization
by: Ning, Kun-Peng, et al.
Published: (2024)

Sample-Efficient Alignment for LLMs
by: Liu, Zichen, et al.
Published: (2024)

Scaling Stick-Breaking Attention: An Efficient Implementation and In-depth Study
by: Tan, Shawn, et al.
Published: (2024)

OPT-Engine: Benchmarking the Limits of LLMs in Optimization Modeling via Complexity Scaling
by: Chen, Yitian, et al.
Published: (2026)

A Graph is Worth $K$ Words: Euclideanizing Graph using Pure Transformer
by: Gao, Zhangyang, et al.
Published: (2024)

AutoScale: Scale-Aware Data Mixing for Pre-Training LLMs
by: Kang, Feiyang, et al.
Published: (2024)

On Giant's Shoulders: Effortless Weak to Strong by Dynamic Logits Fusion
by: Fan, Chenghao, et al.
Published: (2024)

Beyond Redundancy: Diverse and Specialized Multi-Expert Sparse Autoencoder
by: Xu, Zhen, et al.
Published: (2025)