| Main Authors: | Ghaffari, Alireza; Younesian, Sharareh; Nia, Vahid Partovi; Chen, Boxing; Asgharian, Masoud |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2405.13358 |
Similar Items
Rethinking Post-Training Quantization: Introducing a Statistical Pre-Calibration Approach
by: Ghaffari, Alireza, et al.
Published: (2025)
OAC: Output-adaptive Calibration for Accurate Post-training Quantization
by: Edalati, Ali, et al.
Published: (2024)
Mitigating Outlier Activations in Low-Precision Fine-Tuning of Language Models
by: Ghaffari, Alireza, et al.
Published: (2023)
Pause and Reflect: Conformal Aggregation for Chain-of-Thought Reasoning
by: Gu, Yu, et al.
Published: (2026)
PoTPTQ: A Two-step Power-of-Two Post-training for LLMs
by: Wang, Xinyu, et al.
Published: (2025)
MoKA: Mixture of Kronecker Adapters
by: Sadeghi, Mohammadreza, et al.
Published: (2025)
Tiny Noise-Robust Voice Activity Detector for Voice Assistants
by: Asl, Hamed Jafarzadeh, et al.
Published: (2025)
HyWA: Hypernetwork Weight Adapting Personalized Voice Activity Detection
by: Nejad, Mahsa Ghazvini, et al.
Published: (2025)
Understanding Neural Network Binarization with Forward and Backward Proximal Quantizers
by: Lu, Yiwei, et al.
Published: (2024)
Asynchronous Reasoning: Training-Free Interactive Thinking LLMs
by: Yakushev, George, et al.
Published: (2025)
Enterprise Resource Planning Using Multi-type Transformers in Ferro-Titanium Industry
by: Yazdanpourmoghadam, Samira, et al.
Published: (2026)
Tensor Train Recurrent Network Language Model Prediction
by: Alejandro Murua‐Sazo, et al.
Published: (2025)
BiLLM: Pushing the Limit of Post-Training Quantization for LLMs
by: Huang, Wei, et al.
Published: (2024)
On the Impact of Calibration Data in Post-training Quantization and Pruning
by: Williams, Miles, et al.
Published: (2023)
Accelerated Gradient Methods for Sparse Statistical Learning with Nonconvex Penalties
by: Yang, Kai, et al.
Published: (2020)
AGoQ: Activation and Gradient Quantization for Memory-Efficient Distributed Training of LLMs
by: Lin, Wenxiang, et al.
Published: (2026)
BOSCH: Black-Box Binary Optimization for Short-Context Attention-Head Selection in LLMs
by: Ghaddar, Abbas, et al.
Published: (2026)
From Millions of Tweets to Actionable Insights: Leveraging LLMs for User Profiling
by: Rahimzadeh, Vahid, et al.
Published: (2025)
Assessing LLMs for Zero-shot Abstractive Summarization Through the Lens of Relevance Paraphrasing
by: Askari, Hadi, et al.
Published: (2024)
Sandwich attack: Multi-language Mixture Adaptive Attack on LLMs
by: Upadhayay, Bibek, et al.
Published: (2024)
Quantization Meets dLLMs: A Systematic Study of Post-training Quantization for Diffusion LLMs
by: Lin, Haokun, et al.
Published: (2025)
Zero-shot Slot Filling in the Age of LLMs for Dialogue Systems
by: Rana, Mansi, et al.
Published: (2024)
The Uneven Impact of Post-Training Quantization in Machine Translation
by: Marie, Benjamin, et al.
Published: (2025)
Draft on the Fly: Adaptive Self-Speculative Decoding using Cosine Similarity
by: Metel, Michael R., et al.
Published: (2024)
Can Post-Training Quantization Benefit from an Additional QLoRA Integration?
by: Zhu, Xiliang, et al.
Published: (2025)
OptRot: Mitigating Weight Outliers via Data-Free Rotations for Post-Training Quantization
by: Gadhikar, Advait, et al.
Published: (2025)
Trained on Tokens, Calibrated on Concepts: The Emergence of Semantic Calibration in LLMs
by: Nakkiran, Preetum, et al.
Published: (2025)
Zero-shot and Few-shot Learning with Instruction-following LLMs for Claim Matching in Automated Fact-checking
by: Pisarevskaya, Dina, et al.
Published: (2025)
SpeechLLMs for Large-scale Contextualized Zero-shot Slot Filling
by: Hacioglu, Kadri, et al.
Published: (2025)
Zero-shot Graph Reasoning via Retrieval Augmented Framework with LLMs
by: Li, Hanqing, et al.
Published: (2025)
Low-Bit Quantization Favors Undertrained LLMs: Scaling Laws for Quantized LLMs with 100T Training Tokens
by: Ouyang, Xu, et al.
Published: (2024)
Can Post-Training Transform LLMs into Causal Reasoners?
by: Chen, Junqi, et al.
Published: (2026)
Scaling Laws for Post Training Quantized Large Language Models
by: Xu, Zifei, et al.
Published: (2024)
AgentKernelArena: Generalization-Aware Benchmarking of GPU Kernel Optimization Agents
by: Younesian, Sharareh, et al.
Published: (2026)
Low-Rank Quantization-Aware Training for LLMs
by: Bondarenko, Yelysei, et al.
Published: (2024)
Outliers and Calibration Sets have Diminishing Effect on Quantization of Modern LLMs
by: Paglieri, Davide, et al.
Published: (2024)
SignRoundV2: Toward Closing the Performance Gap in Extremely Low-Bit Post-Training Quantization for LLMs
by: Cheng, Wenhua, et al.
Published: (2025)
OTTAWA: Optimal TransporT Adaptive Word Aligner for Hallucination and Omission Translation Errors Detection
by: Huang, Chenyang, et al.
Published: (2024)
Single Parent Family: A Spectrum of Family Members from a Single Pre-Trained Foundation Model
by: Hajimolahoseini, Habib, et al.
Published: (2024)
Batch-Max: Higher LLM Throughput using Larger Batch Sizes and KV Cache Compression
by: Metel, Michael R., et al.
Published: (2024)