| Main Authors: | Miao, Ruijie, Yan, Yihan, Yao, Xinshuo, Yang, Tong |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2407.20272 |
Similar Items
Delta Knowledge Distillation for Large Language Models
by: Cao, Yihan, et al.
Published: (2025)
Scaling Inference-Efficient Language Models
by: Bian, Song, et al.
Published: (2025)
Benchmarking Benchmark Leakage in Large Language Models
by: Xu, Ruijie, et al.
Published: (2024)
KDFlow: A User-Friendly and Efficient Knowledge Distillation Framework for Large Language Models
by: Zhang, Songming, et al.
Published: (2026)
Instruction Mining: Instruction Data Selection for Tuning Large Language Models
by: Cao, Yihan, et al.
Published: (2023)
CITER: Collaborative Inference for Efficient Large Language Model Decoding with Token-Level Routing
by: Zheng, Wenhao, et al.
Published: (2025)
OpenELM: An Efficient Language Model Family with Open Training and Inference Framework
by: Mehta, Sachin, et al.
Published: (2024)
Large Language Model Unlearning
by: Yao, Yuanshun, et al.
Published: (2023)
Athena: Efficient Block-Wise Post-Training Quantization for Large Language Models Using Second-Order Matrix Derivative Information
by: Wang, Yanshu, et al.
Published: (2024)
LLM in a flash: Efficient Large Language Model Inference with Limited Memory
by: Alizadeh, Keivan, et al.
Published: (2023)
Model Compression and Efficient Inference for Large Language Models: A Survey
by: Wang, Wenxiao, et al.
Published: (2024)
I-LLM: Efficient Integer-Only Inference for Fully-Quantized Low-Bit Large Language Models
by: Hu, Xing, et al.
Published: (2024)
Pretraining Data Exposure in Large Language Models: A Survey of Membership Inference, Data Contamination, and Security Implications
by: Tong, Ziyi, et al.
Published: (2026)
Foundations of Large Language Models
by: Xiao, Tong, et al.
Published: (2025)
Rethinking Data Mixing from the Perspective of Large Language Models
by: Xu, Yuanjian, et al.
Published: (2026)
Sliding Window Attention Training for Efficient Large Language Models
by: Fu, Zichuan, et al.
Published: (2025)
An Experimental Design Framework for Label-Efficient Supervised Finetuning of Large Language Models
by: Bhatt, Gantavya, et al.
Published: (2024)
FineSteer: A Unified Framework for Fine-Grained Inference-Time Steering in Large Language Models
by: Weng, Zixuan, et al.
Published: (2026)
PAT: Pruning-Aware Tuning for Large Language Models
by: Liu, Yijiang, et al.
Published: (2024)
A Systematic Survey on Large Language Models for Algorithm Design
by: Liu, Fei, et al.
Published: (2024)
Efficient Non-Parametric Uncertainty Quantification for Black-Box Large Language Models and Decision Planning
by: Tsai, Yao-Hung Hubert, et al.
Published: (2024)
Upcycling Large Language Models into Mixture of Experts
by: He, Ethan, et al.
Published: (2024)
ME-Switch: A Memory-Efficient Expert Switching Framework for Large Language Models
by: Liu, Jing, et al.
Published: (2024)
CDE: Curiosity-Driven Exploration for Efficient Reinforcement Learning in Large Language Models
by: Dai, Runpeng, et al.
Published: (2025)
Revisiting In-context Learning Inference Circuit in Large Language Models
by: Cho, Hakaze, et al.
Published: (2024)
LLM-NEO: Parameter Efficient Knowledge Distillation for Large Language Models
by: Yang, Runming, et al.
Published: (2024)
KIEval: A Knowledge-grounded Interactive Evaluation Framework for Large Language Models
by: Yu, Zhuohao, et al.
Published: (2024)
Efficient Deployment of Large Language Models on Resource-constrained Devices
by: Yao, Zhiwei, et al.
Published: (2025)
CORM: Cache Optimization with Recent Message for Large Language Model Inference
by: Dai, Jincheng, et al.
Published: (2024)
FlattenQuant: Breaking Through the Inference Compute-bound for Large Language Models with Per-tensor Quantization
by: Zhang, Yi, et al.
Published: (2024)
EfficientLLM: Efficiency in Large Language Models
by: Yuan, Zhengqing, et al.
Published: (2025)
ConsistentEE: A Consistent and Hardness-Guided Early Exiting Method for Accelerating Language Models Inference
by: Zeng, Ziqian, et al.
Published: (2023)
Leveraging Large Language Models with Chain-of-Thought and Prompt Engineering for Traffic Crash Severity Analysis and Inference
by: Zhen, Hao, et al.
Published: (2024)
Probing the Robustness of Large Language Models Safety to Latent Perturbations
by: Gu, Tianle, et al.
Published: (2025)
Optimizing Temperature for Language Models with Multi-Sample Inference
by: Du, Weihua, et al.
Published: (2025)
DivLogicEval: A Framework for Benchmarking Logical Reasoning Evaluation in Large Language Models
by: Chung, Tsz Ting, et al.
Published: (2025)
Extracting Training Data from Diffusion Language Models via Infilling
by: Wang, Yihan, et al.
Published: (2026)
Squeeze the Soaked Sponge: Efficient Off-policy Reinforcement Finetuning for Large Language Model
by: Liang, Jing, et al.
Published: (2025)
Language-Image Models with 3D Understanding
by: Cho, Jang Hyun, et al.
Published: (2024)
EfficientQAT: Efficient Quantization-Aware Training for Large Language Models
by: Chen, Mengzhao, et al.
Published: (2024)