| Main Authors: | Xu, Chi; Zhang, Gefei; Zhu, Yantong; Benini, Luca; Hu, Guosheng; Li, Yawei; Zhang, Zhihong |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2503.11164 |
Similar Items
Optimal Brain Restoration for Joint Quantization and Sparsification of LLMs
by: Guo, Hang, et al.
Published: (2025)
PAS: Data-Efficient Plug-and-Play Prompt Augmentation System
by: Zheng, Miao, et al.
Published: (2024)
Mind the Quote: Enabling Quotation-Aware Dialogue in LLMs via Plug-and-Play Modules
by: Zhang, Yueqi, et al.
Published: (2025)
Revisiting Adaptive Rounding with Vectorized Reparameterization for LLM Quantization
by: Zhou, Yuli, et al.
Published: (2026)
Self-Checker: Plug-and-Play Modules for Fact-Checking with Large Language Models
by: Li, Miaoran, et al.
Published: (2023)
Decoupled Alignment for Robust Plug-and-Play Adaptation
by: Luo, Haozheng, et al.
Published: (2024)
SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models
by: Huang, Wei, et al.
Published: (2024)
Variator: Accelerating Pre-trained Models with Plug-and-Play Compression Modules
by: Xiao, Chaojun, et al.
Published: (2023)
Plug-and-Play Training Framework for Preference Optimization
by: Ma, Jingyuan, et al.
Published: (2024)
One-Shot Sensitivity-Aware Mixed Sparsity Pruning for Large Language Models
by: Shao, Hang, et al.
Published: (2023)
Harnessing the Plug-and-Play Controller by Prompting
by: Wang, Hao, et al.
Published: (2024)
DISC: Plug-and-Play Decoding Intervention with Similarity of Characters for Chinese Spelling Check
by: Qiao, Ziheng, et al.
Published: (2024)
LLMs + Persona-Plug = Personalized LLMs
by: Liu, Jiongnan, et al.
Published: (2024)
Revisiting Large Language Model Pruning using Neuron Semantic Attribution
by: Ding, Yizhuo, et al.
Published: (2025)
Sparsity Induction for Accurate Post-Training Pruning of Large Language Models
by: Jiang, Minhao, et al.
Published: (2026)
Generation-Augmented Generation: A Plug-and-Play Framework for Private Knowledge Injection in Large Language Models
by: Li, Rongji, et al.
Published: (2026)
AgentMonitor: A Plug-and-Play Framework for Predictive and Secure Multi-Agent Systems
by: Chan, Chi-Min, et al.
Published: (2024)
Universal Reasoner: A Single, Composable Plug-and-Play Reasoner for Frozen LLMs
by: Kim, Jaemin, et al.
Published: (2025)
360-LLaMA-Factory: Plug & Play Sequence Parallelism for Long Post-Training
by: Zou, Haosheng, et al.
Published: (2025)
X-ALMA: Plug & Play Modules and Adaptive Rejection for Quality Translation at Scale
by: Xu, Haoran, et al.
Published: (2024)
Large Language Models Enhanced by Plug and Play Syntactic Knowledge for Aspect-based Sentiment Analysis
by: Tian, Yuanhe, et al.
Published: (2025)
Plug-and-Play Grounding of Reasoning in Multimodal Large Language Models
by: Chen, Jiaxing, et al.
Published: (2024)
Prune&Comp: Free Lunch for Layer-Pruned LLMs via Iterative Pruning with Magnitude Compensation
by: Chen, Xinrui, et al.
Published: (2025)
Mask the Target: A Plug-and-Play Regularizer Against LoRA Forgetting
by: Xu, Runze, et al.
Published: (2026)
BESA: Pruning Large Language Models with Blockwise Parameter-Efficient Sparsity Allocation
by: Xu, Peng, et al.
Published: (2024)
Plug-and-Play Policy Planner for Large Language Model Powered Dialogue Agents
by: Deng, Yang, et al.
Published: (2023)
Sirius: Contextual Sparsity with Correction for Efficient LLMs
by: Zhou, Yang, et al.
Published: (2024)
A Knowledge Plug-and-Play Test Bed for Open-domain Dialogue Generation
by: Li, Xiangci, et al.
Published: (2024)
Triplets Better Than Pairs: Towards Stable and Effective Self-Play Fine-Tuning for LLMs
by: Wang, Yibo, et al.
Published: (2026)
Sparse-BitNet: 1.58-bit LLMs are Naturally Friendly to Semi-Structured Sparsity
by: Zhang, Di, et al.
Published: (2026)
PCToolkit: A Unified Plug-and-Play Prompt Compression Toolkit of Large Language Models
by: Li, Jinyi, et al.
Published: (2024)
Plug and Play with Prompts: A Prompt Tuning Approach for Controlling Text Generation
by: Ajwani, Rohan Deepak, et al.
Published: (2024)
Towards Spoken Language Understanding via Multi-level Multi-grained Contrastive Learning
by: Cheng, Xuxin, et al.
Published: (2024)
Sensitivity Meets Sparsity: The Impact of Extremely Sparse Parameter Patterns on Theory-of-Mind of Large Language Models
by: Wu, Yuheng, et al.
Published: (2025)
TRIM: Achieving Extreme Sparsity with Targeted Row-wise Iterative Metric-driven Pruning
by: Beck, Florentin, et al.
Published: (2025)
Twilight: Adaptive Attention Sparsity with Hierarchical Top-$p$ Pruning
by: Lin, Chaofan, et al.
Published: (2025)
FastVAR: Linear Visual Autoregressive Modeling via Cached Token Pruning
by: Guo, Hang, et al.
Published: (2025)
UniMEEC: Towards Unified Multimodal Emotion Recognition and Emotion Cause
by: Hu, Guimin, et al.
Published: (2024)
SparK: Query-Aware Unstructured Sparsity with Recoverable KV Cache Channel Pruning
by: Liao, Huanxuan, et al.
Published: (2025)
ThinkPrune: Pruning Long Chain-of-Thought of LLMs via Reinforcement Learning
by: Hou, Bairu, et al.
Published: (2025)