:: Library Catalog

Bibliographic Details
Main Authors:	Xu, Chi; Zhang, Gefei; Zhu, Yantong; Benini, Luca; Hu, Guosheng; Li, Yawei; Zhang, Zhihong
Format:	Preprint
Published:	2025
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2503.11164

Similar Items

Optimal Brain Restoration for Joint Quantization and Sparsification of LLMs
by: Guo, Hang, et al.
Published: (2025)

PAS: Data-Efficient Plug-and-Play Prompt Augmentation System
by: Zheng, Miao, et al.
Published: (2024)

Mind the Quote: Enabling Quotation-Aware Dialogue in LLMs via Plug-and-Play Modules
by: Zhang, Yueqi, et al.
Published: (2025)

Revisiting Adaptive Rounding with Vectorized Reparameterization for LLM Quantization
by: Zhou, Yuli, et al.
Published: (2026)

Self-Checker: Plug-and-Play Modules for Fact-Checking with Large Language Models
by: Li, Miaoran, et al.
Published: (2023)

Decoupled Alignment for Robust Plug-and-Play Adaptation
by: Luo, Haozheng, et al.
Published: (2024)

SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models
by: Huang, Wei, et al.
Published: (2024)

Variator: Accelerating Pre-trained Models with Plug-and-Play Compression Modules
by: Xiao, Chaojun, et al.
Published: (2023)

Plug-and-Play Training Framework for Preference Optimization
by: Ma, Jingyuan, et al.
Published: (2024)

One-Shot Sensitivity-Aware Mixed Sparsity Pruning for Large Language Models
by: Shao, Hang, et al.
Published: (2023)

Harnessing the Plug-and-Play Controller by Prompting
by: Wang, Hao, et al.
Published: (2024)

DISC: Plug-and-Play Decoding Intervention with Similarity of Characters for Chinese Spelling Check
by: Qiao, Ziheng, et al.
Published: (2024)

LLMs + Persona-Plug = Personalized LLMs
by: Liu, Jiongnan, et al.
Published: (2024)

Revisiting Large Language Model Pruning using Neuron Semantic Attribution
by: Ding, Yizhuo, et al.
Published: (2025)

Sparsity Induction for Accurate Post-Training Pruning of Large Language Models
by: Jiang, Minhao, et al.
Published: (2026)

Generation-Augmented Generation: A Plug-and-Play Framework for Private Knowledge Injection in Large Language Models
by: Li, Rongji, et al.
Published: (2026)

AgentMonitor: A Plug-and-Play Framework for Predictive and Secure Multi-Agent Systems
by: Chan, Chi-Min, et al.
Published: (2024)

Universal Reasoner: A Single, Composable Plug-and-Play Reasoner for Frozen LLMs
by: Kim, Jaemin, et al.
Published: (2025)

360-LLaMA-Factory: Plug & Play Sequence Parallelism for Long Post-Training
by: Zou, Haosheng, et al.
Published: (2025)

X-ALMA: Plug & Play Modules and Adaptive Rejection for Quality Translation at Scale
by: Xu, Haoran, et al.
Published: (2024)

Large Language Models Enhanced by Plug and Play Syntactic Knowledge for Aspect-based Sentiment Analysis
by: Tian, Yuanhe, et al.
Published: (2025)

Plug-and-Play Grounding of Reasoning in Multimodal Large Language Models
by: Chen, Jiaxing, et al.
Published: (2024)

Prune&Comp: Free Lunch for Layer-Pruned LLMs via Iterative Pruning with Magnitude Compensation
by: Chen, Xinrui, et al.
Published: (2025)

Mask the Target: A Plug-and-Play Regularizer Against LoRA Forgetting
by: Xu, Runze, et al.
Published: (2026)

BESA: Pruning Large Language Models with Blockwise Parameter-Efficient Sparsity Allocation
by: Xu, Peng, et al.
Published: (2024)

Plug-and-Play Policy Planner for Large Language Model Powered Dialogue Agents
by: Deng, Yang, et al.
Published: (2023)

Sirius: Contextual Sparsity with Correction for Efficient LLMs
by: Zhou, Yang, et al.
Published: (2024)

A Knowledge Plug-and-Play Test Bed for Open-domain Dialogue Generation
by: Li, Xiangci, et al.
Published: (2024)

Triplets Better Than Pairs: Towards Stable and Effective Self-Play Fine-Tuning for LLMs
by: Wang, Yibo, et al.
Published: (2026)

Sparse-BitNet: 1.58-bit LLMs are Naturally Friendly to Semi-Structured Sparsity
by: Zhang, Di, et al.
Published: (2026)

PCToolkit: A Unified Plug-and-Play Prompt Compression Toolkit of Large Language Models
by: Li, Jinyi, et al.
Published: (2024)

Plug and Play with Prompts: A Prompt Tuning Approach for Controlling Text Generation
by: Ajwani, Rohan Deepak, et al.
Published: (2024)

Towards Spoken Language Understanding via Multi-level Multi-grained Contrastive Learning
by: Cheng, Xuxin, et al.
Published: (2024)

Sensitivity Meets Sparsity: The Impact of Extremely Sparse Parameter Patterns on Theory-of-Mind of Large Language Models
by: Wu, Yuheng, et al.
Published: (2025)

TRIM: Achieving Extreme Sparsity with Targeted Row-wise Iterative Metric-driven Pruning
by: Beck, Florentin, et al.
Published: (2025)

Twilight: Adaptive Attention Sparsity with Hierarchical Top-$p$ Pruning
by: Lin, Chaofan, et al.
Published: (2025)

FastVAR: Linear Visual Autoregressive Modeling via Cached Token Pruning
by: Guo, Hang, et al.
Published: (2025)

UniMEEC: Towards Unified Multimodal Emotion Recognition and Emotion Cause
by: Hu, Guimin, et al.
Published: (2024)

SparK: Query-Aware Unstructured Sparsity with Recoverable KV Cache Channel Pruning
by: Liao, Huanxuan, et al.
Published: (2025)

ThinkPrune: Pruning Long Chain-of-Thought of LLMs via Reinforcement Learning
by: Hou, Bairu, et al.
Published: (2025)