Library Catalog

Bibliographic Details
Main Authors:	Cook, Jack; Guo, Junxian; Xiao, Guangxuan; Lin, Yujun; Wyss, Keith; Nazemi, Mahdi; Mishra, Asit; del Mundo, Carlo; Blankevoort, Tijmen; Han, Song
Format:	Preprint
Published:	2025
Subjects:	Computation and Language; Machine Learning
Online Access:	https://arxiv.org/abs/2512.02010

Similar Items

SOAR: Scale Optimization for Accurate Reconstruction in NVFP4 Quantization
by: Bao, Chengzhu, et al.
Published: (2026)

Adaptive Block-Scaled Data Types
by: Cook, Jack, et al.
Published: (2026)

Quantization-Aware Distillation for NVFP4 Inference Accuracy Recovery
by: Xin, Meng, et al.
Published: (2026)

Pruning vs Quantization: Which is Better?
by: Kuzmin, Andrey, et al.
Published: (2023)

Optimizing Mixture of Block Attention
by: Xiao, Guangxuan, et al.
Published: (2025)

FP8 Quantization: The Power of the Exponent
by: Kuzmin, Andrey, et al.
Published: (2022)

Data Repetition Beats Data Scaling in Long-CoT Supervised Fine-Tuning
by: Kopiczko, Dawid J., et al.
Published: (2026)

Bitune: Leveraging Bidirectional Attention to Improve Decoder-Only LLMs
by: Kopiczko, Dawid J., et al.
Published: (2024)

VeRA: Vector-based Random Matrix Adaptation
by: Kopiczko, Dawid J., et al.
Published: (2023)

XAttention: Block Sparse Attention with Antidiagonal Scoring
by: Xu, Ruyi, et al.
Published: (2025)

Pretraining Large Language Models with NVFP4
by: NVIDIA, et al.
Published: (2025)

GPTVQ: The Blessing of Dimensionality for LLM Quantization
by: van Baalen, Mart, et al.
Published: (2024)

NeuroBlend: Towards Low-Power yet Accurate Neural Network-Based Inference Engine Blending Binary and Fixed-Point Convolutions
by: Fayyazi, Arash, et al.
Published: (2023)

Sensitivity-Aware Mixed-Precision Quantization and Width Optimization of Deep Neural Networks Through Cluster-Based Tree-Structured Parzen Estimation
by: Azizi, Seyedarmin, et al.
Published: (2023)

MixFP4: Enhancing NVFP4 with Adaptive FP4/INT4 Block Representations
by: Zou, Jiaxiang, et al.
Published: (2026)

ARCQuant: Boosting NVFP4 Quantization with Augmented Residual Channels for LLMs
by: Meng, Haoqian, et al.
Published: (2026)

SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
by: Xiao, Guangxuan, et al.
Published: (2022)

FAAR: Format-Aware Adaptive Rounding for NVFP4
by: Li, Hanglin, et al.
Published: (2026)

Think Big, Generate Quick: LLM-to-SLM for Fast Autoregressive Decoding
by: Bergner, Benjamin, et al.
Published: (2024)

Elastic ViTs from Pretrained Models without Retraining
by: Simoncini, Walter, et al.
Published: (2025)

InterroGate: Learning to Share, Specialize, and Prune Representations for Multi-task Learning
by: Bejnordi, Babak Ehteshami, et al.
Published: (2024)

ParetoQ: Improving Scaling Laws in Extremely Low-bit LLM Quantization
by: Liu, Zechun, et al.
Published: (2025)

Accurate Block Quantization in LLMs with Outliers
by: Trukhanov, Nikita, et al.
Published: (2024)

QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving
by: Lin, Yujun, et al.
Published: (2024)

RaZeR: Pushing the Limits of NVFP4 Quantization with Redundant Zero Remapping
by: Chen, Yuzong, et al.
Published: (2025)

Memory-Efficient Vision Transformers: An Activation-Aware Mixed-Rank Compression Strategy
by: Azizi, Seyedarmin, et al.
Published: (2024)

Quartet II: Accurate LLM Pre-Training in NVFP4 by Improved Unbiased Gradient Estimation
by: Panferov, Andrei, et al.
Published: (2026)

The LLM Surgeon
by: van der Ouderaa, Tycho F. A., et al.
Published: (2023)

LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention
by: Yang, Shang, et al.
Published: (2025)

Bifurcation and Quasiperiodic Behaviors of Ion Acoustic Waves in Magnetoplasmas with Nonthermal Electrons Featuring Tsallis Distribution
by: Asit Saha
Published: (2015)

Solitonic, Periodic and Quasiperiodic Behaviors of Dust Ion Acoustic Waves in Superthermal Plasmas
by: Asit Saha
Published: (2015)

Dynamic Motions of Ion Acoustic Waves in Plasmas with Superthermal Electrons
by: Asit Saha
Published: (2015)

Recipes for Pre-training LLMs with MXFP8
by: Mishra, Asit, et al.
Published: (2025)

SpinQuant: LLM quantization with learned rotations
by: Liu, Zechun, et al.
Published: (2024)

LAUREANO CASTRO NOGUEIRA, LUIS CASTRO NOGUEIRA Y MIGUEL ÁNGEL CASTRO NOGUEIRA, ¿Quién teme a la naturaleza humana? Madrid: Tecnos, 2008
by: Jordi Mundó
Published: (2010)

Análisis crítico de la evolución de la anemia y la deficiencia de micronutrimientos en la población
by: Verónica Mundo
Published: (2007)

Simposio: Educación, convivencia e instituciones. Pilares de una visión compartida
by: Mabel Mundó
Published: (2012)

TetraJet-v2: Accurate NVFP4 Training for Large Language Models with Oscillation Suppression and Outlier Control
by: Chen, Yuxiang, et al.
Published: (2025)

Dissecting Outlier Dynamics in LLM NVFP4 Pretraining
by: Dong, Peijie, et al.
Published: (2026)

Biological activities (antibacterial, antifungal and cytotoxic) of secondary metabolites of Ircinia spp.
by: Nazemi, Melika
Published: (2013)