| Main Authors: | Yang, Ge, He, Changyi, Guo, Jinyang, Wu, Jianyu, Ding, Yifu, Liu, Aishan, Qin, Haotong, Ji, Pengliang, Liu, Xianglong |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2410.21352 |
Similar Items
A Survey of Low-bit Large Language Models: Basics, Systems, and Algorithms
by: Gong, Ruihao, et al.
Published: (2024)
BWTA: Accurate and Efficient Binarized Transformer by Algorithm-Hardware Co-design
by: Ding, Yifu, et al.
Published: (2026)
SLMQuant: Benchmarking Small Language Model Quantization for Practical Deployment
by: Wang, Jiacheng, et al.
Published: (2025)
DB-LLM: Accurate Dual-Binarization for Efficient LLMs
by: Chen, Hong, et al.
Published: (2024)
First-Order Error Matters: Accurate Compensation for Quantized Large Language Models
by: Zheng, Xingyu, et al.
Published: (2025)
MoDES: Accelerating Mixture-of-Experts Multimodal Large Language Models via Dynamic Expert Skipping
by: Huang, Yushi, et al.
Published: (2025)
Detoxifying Large Language Models via Autoregressive Reward Guided Representation Editing
by: Xiao, Yisong, et al.
Published: (2025)
CodeSimpleQA: Scaling Factuality in Code Large Language Models
by: Yang, Jian, et al.
Published: (2025)
LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit
by: Gong, Ruihao, et al.
Published: (2024)
SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models
by: Huang, Wei, et al.
Published: (2024)
PTQ4SAM: Post-Training Quantization for Segment Anything
by: Lv, Chengtao, et al.
Published: (2024)
M2G-Eval: Enhancing and Evaluating Multi-granularity Multilingual Code Generation
by: Xu, Fanglin, et al.
Published: (2025)
PTSBench: A Comprehensive Post-Training Sparsity Benchmark Towards Algorithms and Models
by: Wang, Zining, et al.
Published: (2024)
How Emotion Shapes the Behavior of LLMs and Agents: A Mechanistic Study
by: Sun, Moran, et al.
Published: (2026)
SafeBench: A Safety Evaluation Framework for Multimodal Large Language Models
by: Ying, Zonghao, et al.
Published: (2024)
Context as a Tool: Context Management for Long-Horizon SWE-Agents
by: Liu, Shukai, et al.
Published: (2025)
Reasoning-Augmented Conversation for Multi-Turn Jailbreak Attacks on Large Language Models
by: Ying, Zonghao, et al.
Published: (2025)
Scaling Laws for Code: Every Programming Language Matters
by: Yang, Jian, et al.
Published: (2025)
OmniSIFT: Modality-Asymmetric Token Compression for Efficient Omni-modal Large Language Models
by: Ding, Yue, et al.
Published: (2026)
Large Language Models Still Face Challenges in Multi-Hop Reasoning with External Knowledge
by: Zhang, Haotong
Published: (2024)
DDK: Distilling Domain Knowledge for Efficient Large Language Models
by: Liu, Jiaheng, et al.
Published: (2024)
ARB-LLM: Alternating Refined Binarizations for Large Language Models
by: Li, Zhiteng, et al.
Published: (2024)
Accurate LoRA-Finetuning Quantization of LLMs via Information Retention
by: Qin, Haotong, et al.
Published: (2024)
BiLLM: Pushing the Limit of Post-Training Quantization for LLMs
by: Huang, Wei, et al.
Published: (2024)
Benchmarking Gaslighting Attacks Against Speech Large Language Models
by: Wu, Jinyang, et al.
Published: (2025)
Diagonal-Tiled Mixed-Precision Attention for Efficient Low-Bit MXFP Inference
by: Ding, Yifu, et al.
Published: (2026)
QuantSR+: Pushing the Limit of Quantized Image Super-Resolution Networks
by: Qin, Haotong, et al.
Published: (2026)
Decoding by Contrasting Knowledge: Enhancing LLMs' Confidence on Edited Facts
by: Bi, Baolong, et al.
Published: (2024)
Fairness Mediator: Neutralize Stereotype Associations to Mitigate Bias in Large Language Models
by: Xiao, Yisong, et al.
Published: (2025)
StableToolBench: Towards Stable Large-Scale Benchmarking on Tool Learning of Large Language Models
by: Guo, Zhicheng, et al.
Published: (2024)
MoMQ: Mixture-of-Experts Enhances Multi-Dialect Query Generation across Relational and Non-Relational Databases
by: Lin, Zhisheng, et al.
Published: (2024)
AGENTSAFE: Benchmarking the Safety of Embodied Agents on Hazardous Instructions
by: Ying, Zonghao, et al.
Published: (2025)
Exploring Semantic-constrained Adversarial Example with Instruction Uncertainty Reduction
by: Hu, Jin, et al.
Published: (2025)
LSAQ: Layer-Specific Adaptive Quantization for Large Language Model Deployment
by: Zeng, Binrui, et al.
Published: (2024)
Deploying Multi-task Online Server with Large Language Model
by: Qu, Yincen, et al.
Published: (2024)
BiDM: Pushing the Limit of Quantization for Diffusion Models
by: Zheng, Xingyu, et al.
Published: (2024)
UCoder: Unsupervised Code Generation by Internal Probing of Large Language Models
by: Wu, Jiajun, et al.
Published: (2025)
Parrot Mind: Towards Explaining the Complex Task Reasoning of Pretrained Large Language Models with Template-Content Structure
by: Yang, Haotong, et al.
Published: (2023)
Towards Understanding the Safety Boundaries of DeepSeek Models: Evaluation and Findings
by: Ying, Zonghao, et al.
Published: (2025)
EgoPlan-Bench: Benchmarking Multimodal Large Language Models for Human-Level Planning
by: Chen, Yi, et al.
Published: (2023)