:: Library Catalog

Bibliographic Details
Main Authors:	Edalati, Ali; Ghaffari, Alireza; Nejad, Mahsa Ghazvini; Hou, Lu; Chen, Boxing; Asgharian, Masoud; Nia, Vahid Partovi
Format:	Preprint
Published:	2024
Subjects:	Machine Learning; Computation and Language
Online Access:	https://arxiv.org/abs/2405.15025

Similar Items

Mitigating Outlier Activations in Low-Precision Fine-Tuning of Language Models
by: Ghaffari, Alireza, et al.
Published: (2023)

AdpQ: A Zero-shot Calibration Free Adaptive Post Training Quantization Method for LLMs
by: Ghaffari, Alireza, et al.
Published: (2024)

Rethinking Post-Training Quantization: Introducing a Statistical Pre-Calibration Approach
by: Ghaffari, Alireza, et al.
Published: (2025)

Tiny Noise-Robust Voice Activity Detector for Voice Assistants
by: Asl, Hamed Jafarzadeh, et al.
Published: (2025)

MoKA: Mixture of Kronecker Adapters
by: Sadeghi, Mohammadreza, et al.
Published: (2025)

Pause and Reflect: Conformal Aggregation for Chain-of-Thought Reasoning
by: Gu, Yu, et al.
Published: (2026)

HyWA: Hypernetwork Weight Adapting Personalized Voice Activity Detection
by: Nejad, Mahsa Ghazvini, et al.
Published: (2025)

PoTPTQ: A Two-step Power-of-Two Post-training for LLMs
by: Wang, Xinyu, et al.
Published: (2025)

Understanding Neural Network Binarization with Forward and Backward Proximal Quantizers
by: Lu, Yiwei, et al.
Published: (2024)

On the Impact of Calibration Data in Post-training Quantization and Pruning
by: Williams, Miles, et al.
Published: (2023)

Enterprise Resource Planning Using Multi-type Transformers in Ferro-Titanium Industry
by: Yazdanpourmoghadam, Samira, et al.
Published: (2026)

Quantization Meets dLLMs: A Systematic Study of Post-training Quantization for Diffusion LLMs
by: Lin, Haokun, et al.
Published: (2025)

Visualizing Spatial Point Clouds: A Task-Oriented Taxonomy
by: Partovi, Mahsa, et al.
Published: (2025)

Efficient Post-training Quantization with FP8 Formats
by: Shen, Haihao, et al.
Published: (2023)

QDyLoRA: Quantized Dynamic Low-Rank Adaptation for Efficient Large Language Model Tuning
by: Rajabzadeh, Hossein, et al.
Published: (2024)

From Millions of Tweets to Actionable Insights: Leveraging LLMs for User Profiling
by: Rahimzadeh, Vahid, et al.
Published: (2025)

Towards Accurate Post-training Quantization for Diffusion Models
by: Wang, Changyuan, et al.
Published: (2023)

SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
by: Xiao, Guangxuan, et al.
Published: (2022)

Demystifying Domain-adaptive Post-training for Financial LLMs
by: Ke, Zixuan, et al.
Published: (2025)

K-Quantization and its Impact on Output Performance
by: Davidsson, Robin Baki, et al.
Published: (2026)

ReGLA: Refining Gated Linear Attention
by: Lu, Peng, et al.
Published: (2025)

GSQ: Highly-Accurate Low-Precision Scalar Quantization for LLMs via Gumbel-Softmax Sampling
by: Dadgarnia, Alireza, et al.
Published: (2026)

Tensor Train Recurrent Network Language Model Prediction
by: Alejandro Murua‐Sazo, et al.
Published: (2025)

Batch-Max: Higher LLM Throughput using Larger Batch Sizes and KV Cache Compression
by: Metel, Michael R., et al.
Published: (2024)

Accelerated Gradient Methods for Sparse Statistical Learning with Nonconvex Penalties
by: Yang, Kai, et al.
Published: (2020)

Draft on the Fly: Adaptive Self-Speculative Decoding using Cosine Similarity
by: Metel, Michael R., et al.
Published: (2024)

SLaNC: Static LayerNorm Calibration
by: Salmani, Mahsa, et al.
Published: (2024)

STAGE: Simplified Text-Attributed Graph Embeddings Using Pre-trained LLMs
by: Zolnai-Lucas, Aaron, et al.
Published: (2024)

On the importance of Data Scale in Pretraining Arabic Language Models
by: Ghaddar, Abbas, et al.
Published: (2024)

Integral Transformer: Denoising Attention, Not Too Much Not Too Little
by: Kobyzev, Ivan, et al.
Published: (2025)

BOSCH: Black-Box Binary Optimization for Short-Context Attention-Head Selection in LLMs
by: Ghaddar, Abbas, et al.
Published: (2026)

Sorted LLaMA: Unlocking the Potential of Intermediate Layers of Large Language Models for Dynamic Inference
by: Kavehzadeh, Parsa, et al.
Published: (2023)

Towards Accurate Post-training Quantization for Reparameterized Models
by: Zhang, Luoming, et al.
Published: (2024)

Safety Verification for Evasive Collision Avoidance in Autonomous Vehicles with Enhanced Resolutions
by: Arab, Aliasghar, et al.
Published: (2024)

Accurate KV Cache Quantization with Outlier Tokens Tracing
by: Su, Yi, et al.
Published: (2025)

Assessing Influential Observations in Pain Prediction using fMRI Data
by: Zhang, Dongliang, et al.
Published: (2024)

Detection of Multiple Influential Observations on Model Selection
by: Zhang, Dongliang, et al.
Published: (2024)

Transferable Post-training via Inverse Value Learning
by: Lu, Xinyu, et al.
Published: (2024)

On Predicting the Post-training Potential of Pre-trained LLMs
by: Li, Xiaoyuan, et al.
Published: (2026)

On the Importance of a Multi-Scale Calibration for Quantization
by: Son, Seungwoo, et al.
Published: (2026)