:: Library Catalog

Bibliographic Details
Main Authors:	Yang, Jaewoo; Kim, Hayun; Kim, Younghoon
Format:	Preprint
Published:	2024
Subjects:	Computation and Language; Machine Learning
Online Access:	https://arxiv.org/abs/2405.14428

Similar Items

ExpertWeaver: Unlocking the Inherent MoE in Dense LLMs with GLU Activation Patterns
by: Zhao, Ziyu, et al.
Published: (2026)

Mitigating the Impact of Outlier Channels for Language Model Quantization with Activation Regularization
by: Nrusimha, Aniruddha, et al.
Published: (2024)

Prefixing Attention Sinks can Mitigate Activation Outliers for Large Language Model Quantization
by: Son, Seungwoo, et al.
Published: (2024)

GLU Attention Improve Transformer
by: Wang, Zehao
Published: (2025)

LQER: Low-Rank Quantization Error Reconstruction for LLMs
by: Zhang, Cheng, et al.
Published: (2024)

GLUScope: A Tool for Analyzing GLU Neurons in Transformer Language Models
by: Gerstner, Sebastian, et al.
Published: (2026)

Can LLMs Deceive CLIP? Benchmarking Adversarial Compositionality of Pre-trained Multimodal Representation via Text Updates
by: Ahn, Jaewoo, et al.
Published: (2025)

Dual Path Attribution: Efficient Attribution for SwiGLU-Transformers through Layer-Wise Target Propagation
by: Jantsch, Lasse Marten, et al.
Published: (2026)

Dependency-Aware Semi-Structured Sparsity of GLU Variants in Large Language Models
by: Guo, Zhiyu, et al.
Published: (2024)

Agile-Quant: Activation-Guided Quantization for Faster Inference of LLMs on the Edge
by: Shen, Xuan, et al.
Published: (2023)

Enhancing Delta Compression in LLMs via SVD-based Quantization Error Minimization
by: Xiong, Boya, et al.
Published: (2025)

ZClip: Adaptive Spike Mitigation for LLM Pre-Training
by: Kumar, Abhay, et al.
Published: (2025)

Fast Matrix Multiplications for Lookup Table-Quantized LLMs
by: Guo, Han, et al.
Published: (2024)

Outlier-Safe Pre-Training for Robust 4-Bit Quantization of Large Language Models
by: Park, Jungwoo, et al.
Published: (2025)

RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization
by: Huang, Xijie, et al.
Published: (2024)

L4Q: Parameter Efficient Quantization-Aware Fine-Tuning on Large Language Models
by: Jeon, Hyesung, et al.
Published: (2024)

Learning to Correct for QA Reasoning with Black-box LLMs
by: Kim, Jaehyung, et al.
Published: (2024)

Low-Bit Quantization Favors Undertrained LLMs: Scaling Laws for Quantized LLMs with 100T Training Tokens
by: Ouyang, Xu, et al.
Published: (2024)

Depth Registers Unlock W4A4 on SwiGLU: A Reader/Generator Decomposition
by: Liu, Ziyang
Published: (2026)

Why Does Self-Distillation (Sometimes) Degrade the Reasoning Capability of LLMs?
by: Kim, Jeonghye, et al.
Published: (2026)

SLEB: Streamlining LLMs through Redundancy Verification and Elimination of Transformer Blocks
by: Song, Jiwon, et al.
Published: (2024)

QUICK: Quantization-aware Interleaving and Conflict-free Kernel for efficient LLM inference
by: Kim, Taesu, et al.
Published: (2024)

A More Accurate Approximation of Activation Function with Few Spikes Neurons
by: Jeong, Dayena, et al.
Published: (2024)

On the Importance of a Multi-Scale Calibration for Quantization
by: Son, Seungwoo, et al.
Published: (2026)

QEFT: Quantization for Efficient Fine-Tuning of LLMs
by: Lee, Changhun, et al.
Published: (2024)

How Does Quantization Affect Multilingual LLMs?
by: Marchisio, Kelly, et al.
Published: (2024)

LittleBit: Ultra Low-Bit Quantization via Latent Factorization
by: Lee, Banseok, et al.
Published: (2025)

Turning LLM Activations Quantization-Friendly
by: Czakó, Patrik, et al.
Published: (2025)

Few-shot Personalization of LLMs with Mis-aligned Responses
by: Kim, Jaehyung, et al.
Published: (2024)

Mitigating Bias in RAG: Controlling the Embedder
by: Kim, Taeyoun, et al.
Published: (2025)

Culinary Class Wars: Evaluating LLMs using ASH in Cuisine Transfer Task
by: Lee, Hoonick, et al.
Published: (2024)

Accurate LoRA-Finetuning Quantization of LLMs via Information Retention
by: Qin, Haotong, et al.
Published: (2024)

Subgraph-level Universal Prompt Tuning
by: Lee, Junhyun, et al.
Published: (2024)

Interpreting and Mitigating Unwanted Uncertainty in LLMs
by: Roy, Tiasa Singha, et al.
Published: (2025)

Interpreting the Effects of Quantization on LLMs
by: Singh, Manpreet, et al.
Published: (2025)

GEMQ: Global Expert-Level Mixed-Precision Quantization for MoE LLMs
by: Deng, Jianing, et al.
Published: (2026)

Rethinking State Tracking in Recurrent Models Through Error Control Dynamics
by: Chung, Jiwan, et al.
Published: (2026)

SqueezeLLM: Dense-and-Sparse Quantization
by: Kim, Sehoon, et al.
Published: (2023)

Breaking the Mirror: Activation-Based Mitigation of Self-Preference in LLM Evaluators
by: Roytburg, Dani, et al.
Published: (2025)

AAAC: Activation-Aware Adaptive Codebooks for 4-bit LLM Weight Quantization
by: IslamBouli, Beshr, et al.
Published: (2026)