| Main Authors: | Yoon, Junho, Lee, Geom, Jeon, Donghyeon, Kang, Inho, Na, Seung-Hoon |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2506.13472 |
Similar Items
QUPID: Quantified Understanding for Enhanced Performance, Insights, and Decisions in Korean Search Engines
by: Kwon, Ohjoon, et al.
Published: (2025)
SLM as Guardian: Pioneering AI Safety with Small Language Models
by: Kwon, Ohjoon, et al.
Published: (2024)
AMQ: Enabling AutoML for Mixed-precision Weight-Only Quantization of Large Language Models
by: Lee, Sangjun, et al.
Published: (2025)
ChroKnowledge: Unveiling Chronological Knowledge of Language Models in Multiple Domains
by: Park, Yein, et al.
Published: (2024)
GWQ: Gradient-Aware Weight Quantization for Large Language Models
by: Shao, Yihua, et al.
Published: (2024)
Outlier-Safe Pre-Training for Robust 4-Bit Quantization of Large Language Models
by: Park, Jungwoo, et al.
Published: (2025)
CacheFocus: Dynamic Cache Re-Positioning for Efficient Retrieval-Augmented Generation
by: Lee, Kun-Hui, et al.
Published: (2025)
On the Compressibility of Quantized Large Language Models
by: Mao, Yu, et al.
Published: (2024)
From Relevance to Authority: Authority-aware Generative Retrieval in Web Search Engines
by: Lee, Sunkyung, et al.
Published: (2026)
QWHA: Quantization-Aware Walsh-Hadamard Adaptation for Parameter-Efficient Fine-Tuning on Large Language Models
by: Jeon, Hyesung, et al.
Published: (2025)
SASQ: Static Activation Scaling for Quantization-Aware Training in Large Language Models
by: Mao, Shizhuo, et al.
Published: (2025)
Enhancing Robustness of Retrieval-Augmented Language Models with In-Context Learning
by: Park, Seong-Il, et al.
Published: (2024)
EfficientQAT: Efficient Quantization-Aware Training for Large Language Models
by: Chen, Mengzhao, et al.
Published: (2024)
Weights-Rotated Preference Optimization for Large Language Models
by: Yang, Chenxu, et al.
Published: (2025)
Language-Agnostic Suicidal Risk Detection Using Large Language Models
by: Kim, June-Woo, et al.
Published: (2025)
Aligning Large Language Models for Enhancing Psychiatric Interviews Through Symptom Delineation and Summarization: Pilot Study
by: So, Jae-hee, et al.
Published: (2024)
Assigning Distinct Roles to Quantized and Low-Rank Matrices Toward Optimal Weight Decomposition
by: Cho, Yoonjun, et al.
Published: (2025)
Refining Salience-Aware Sparse Fine-Tuning Strategies for Language Models
by: Liu, Xinxin, et al.
Published: (2024)
Mentor-KD: Making Small Language Models Better Multi-step Reasoners
by: Lee, Hojae, et al.
Published: (2024)
Quantization-Aware and Tensor-Compressed Training of Transformers for Natural Language Understanding
by: Yang, Zi, et al.
Published: (2023)
Saliency-Aware Regularized Quantization Calibration for Large Language Models
by: Zhao, Yanlong, et al.
Published: (2026)
Saliency-driven Dynamic Token Pruning for Large Language Models
by: Tao, Yao, et al.
Published: (2025)
Knowledge Synthesis of Photosynthesis Research Using a Large Language Model
by: Yoon, Seungri, et al.
Published: (2025)
ETHIC: Evaluating Large Language Models on Long-Context Tasks with High Information Coverage
by: Lee, Taewhoo, et al.
Published: (2024)
DICE-BENCH: Evaluating the Tool-Use Capabilities of Large Language Models in Multi-Round, Multi-Party Dialogues
by: Jang, Kyochul, et al.
Published: (2025)
OWQ: Outlier-Aware Weight Quantization for Efficient Fine-Tuning and Inference of Large Language Models
by: Lee, Changhun, et al.
Published: (2023)
When Compression Meets Model Compression: Memory-Efficient Double Compression for Large Language Models
by: Wang, Weilan, et al.
Published: (2025)
What Models Know, How Well They Know It: Knowledge-Weighted Fine-Tuning for Learning When to Say "I Don't Know"
by: Lee, Joosung, et al.
Published: (2026)
BASE-Q: Bias and Asymmetric Scaling Enhanced Rotational Quantization for Large Language Models
by: He, Liulu, et al.
Published: (2025)
DLLMQuant: Quantizing Diffusion-based Large Language Models
by: Xu, Chen, et al.
Published: (2025)
Uncertainty-Aware Adaptation of Large Language Models for Protein-Protein Interaction Analysis
by: Jantre, Sanket, et al.
Published: (2025)
LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit
by: Gong, Ruihao, et al.
Published: (2024)
EXIT: Context-Aware Extractive Compression for Enhancing Retrieval-Augmented Generation
by: Hwang, Taeho, et al.
Published: (2024)
PCEval: A Benchmark for Evaluating Physical Computing Capabilities of Large Language Models
by: Song, Inpyo, et al.
Published: (2025)
L4Q: Parameter Efficient Quantization-Aware Fine-Tuning on Large Language Models
by: Jeon, Hyesung, et al.
Published: (2024)
A Multi-faceted Analysis of Cognitive Abilities: Evaluating Prompt Methods with Large Language Models on the CONSORT Checklist
by: Jeon, Sohyeon, et al.
Published: (2025)
SiLQ: Simple Large Language Model Quantization-Aware Training
by: Esser, Steven K., et al.
Published: (2025)
WKVQuant: Quantizing Weight and Key/Value Cache for Large Language Models Gains More
by: Yue, Yuxuan, et al.
Published: (2024)
Dynamic Compressing Prompts for Efficient Inference of Large Language Models
by: Hu, Jinwu, et al.
Published: (2025)
LLMCBench: Benchmarking Large Language Model Compression for Efficient Deployment
by: Yang, Ge, et al.
Published: (2024)