Library Catalog

Bibliographic Details
Main authors:	Ghaffari, Alireza; Younesian, Sharareh; Nia, Vahid Partovi; Chen, Boxing; Asgharian, Masoud
Format:	Preprint
Published:	2024
Subjects:	Computation and Language
Online access:	https://arxiv.org/abs/2405.13358

Similar Items

Rethinking Post-Training Quantization: Introducing a Statistical Pre-Calibration Approach
by: Ghaffari, Alireza, et al.
Published: (2025)

OAC: Output-adaptive Calibration for Accurate Post-training Quantization
by: Edalati, Ali, et al.
Published: (2024)

Mitigating Outlier Activations in Low-Precision Fine-Tuning of Language Models
by: Ghaffari, Alireza, et al.
Published: (2023)

Pause and Reflect: Conformal Aggregation for Chain-of-Thought Reasoning
by: Gu, Yu, et al.
Published: (2026)

PoTPTQ: A Two-step Power-of-Two Post-training for LLMs
by: Wang, Xinyu, et al.
Published: (2025)

MoKA: Mixture of Kronecker Adapters
by: Sadeghi, Mohammadreza, et al.
Published: (2025)

Tiny Noise-Robust Voice Activity Detector for Voice Assistants
by: Asl, Hamed Jafarzadeh, et al.
Published: (2025)

HyWA: Hypernetwork Weight Adapting Personalized Voice Activity Detection
by: Nejad, Mahsa Ghazvini, et al.
Published: (2025)

Understanding Neural Network Binarization with Forward and Backward Proximal Quantizers
by: Lu, Yiwei, et al.
Published: (2024)

Asynchronous Reasoning: Training-Free Interactive Thinking LLMs
by: Yakushev, George, et al.
Published: (2025)

Enterprise Resource Planning Using Multi-type Transformers in Ferro-Titanium Industry
by: Yazdanpourmoghadam, Samira, et al.
Published: (2026)

Tensor Train Recurrent Network Language Model Prediction
by: Alejandro Murua‐Sazo, et al.
Published: (2025)

BiLLM: Pushing the Limit of Post-Training Quantization for LLMs
by: Huang, Wei, et al.
Published: (2024)

On the Impact of Calibration Data in Post-training Quantization and Pruning
by: Williams, Miles, et al.
Published: (2023)

Accelerated Gradient Methods for Sparse Statistical Learning with Nonconvex Penalties
by: Yang, Kai, et al.
Published: (2020)

AGoQ: Activation and Gradient Quantization for Memory-Efficient Distributed Training of LLMs
by: Lin, Wenxiang, et al.
Published: (2026)

BOSCH: Black-Box Binary Optimization for Short-Context Attention-Head Selection in LLMs
by: Ghaddar, Abbas, et al.
Published: (2026)

From Millions of Tweets to Actionable Insights: Leveraging LLMs for User Profiling
by: Rahimzadeh, Vahid, et al.
Published: (2025)

Assessing LLMs for Zero-shot Abstractive Summarization Through the Lens of Relevance Paraphrasing
by: Askari, Hadi, et al.
Published: (2024)

Sandwich attack: Multi-language Mixture Adaptive Attack on LLMs
by: Upadhayay, Bibek, et al.
Published: (2024)

Quantization Meets dLLMs: A Systematic Study of Post-training Quantization for Diffusion LLMs
by: Lin, Haokun, et al.
Published: (2025)

Zero-shot Slot Filling in the Age of LLMs for Dialogue Systems
by: Rana, Mansi, et al.
Published: (2024)

The Uneven Impact of Post-Training Quantization in Machine Translation
by: Marie, Benjamin, et al.
Published: (2025)

Draft on the Fly: Adaptive Self-Speculative Decoding using Cosine Similarity
by: Metel, Michael R., et al.
Published: (2024)

Can Post-Training Quantization Benefit from an Additional QLoRA Integration?
by: Zhu, Xiliang, et al.
Published: (2025)

OptRot: Mitigating Weight Outliers via Data-Free Rotations for Post-Training Quantization
by: Gadhikar, Advait, et al.
Published: (2025)

Trained on Tokens, Calibrated on Concepts: The Emergence of Semantic Calibration in LLMs
by: Nakkiran, Preetum, et al.
Published: (2025)

Zero-shot and Few-shot Learning with Instruction-following LLMs for Claim Matching in Automated Fact-checking
by: Pisarevskaya, Dina, et al.
Published: (2025)

SpeechLLMs for Large-scale Contextualized Zero-shot Slot Filling
by: Hacioglu, Kadri, et al.
Published: (2025)

Zero-shot Graph Reasoning via Retrieval Augmented Framework with LLMs
by: Li, Hanqing, et al.
Published: (2025)

Low-Bit Quantization Favors Undertrained LLMs: Scaling Laws for Quantized LLMs with 100T Training Tokens
by: Ouyang, Xu, et al.
Published: (2024)

Can Post-Training Transform LLMs into Causal Reasoners?
by: Chen, Junqi, et al.
Published: (2026)

Scaling Laws for Post Training Quantized Large Language Models
by: Xu, Zifei, et al.
Published: (2024)

AgentKernelArena: Generalization-Aware Benchmarking of GPU Kernel Optimization Agents
by: Younesian, Sharareh, et al.
Published: (2026)

Low-Rank Quantization-Aware Training for LLMs
by: Bondarenko, Yelysei, et al.
Published: (2024)

Outliers and Calibration Sets have Diminishing Effect on Quantization of Modern LLMs
by: Paglieri, Davide, et al.
Published: (2024)

SignRoundV2: Toward Closing the Performance Gap in Extremely Low-Bit Post-Training Quantization for LLMs
by: Cheng, Wenhua, et al.
Published: (2025)

OTTAWA: Optimal TransporT Adaptive Word Aligner for Hallucination and Omission Translation Errors Detection
by: Huang, Chenyang, et al.
Published: (2024)

Single Parent Family: A Spectrum of Family Members from a Single Pre-Trained Foundation Model
by: Hajimolahoseini, Habib, et al.
Published: (2024)

Batch-Max: Higher LLM Throughput using Larger Batch Sizes and KV Cache Compression
by: Metel, Michael R., et al.
Published: (2024)