Library Catalog

Bibliographic Details
Main Authors:	Tseng, Albert; Yu, Tao; Park, Youngsuk
Format:	Preprint
Published:	2025
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2502.20586

Similar Items

Stochastic Rounding for LLM Training: Theory and Practice
by: Ozkara, Kaan, et al.
Published: (2025)

Scaling Laws Meet Model Architecture: Toward Inference-Efficient LLMs
by: Bian, Song, et al.
Published: (2025)

Oscillation-Reduced MXFP4 Training for Vision Transformers
by: Chen, Yuxiang, et al.
Published: (2025)

TritonRL: Training LLMs to Think and Code Triton Without Cheating
by: Woo, Jiin, et al.
Published: (2025)

Recipes for Pre-training LLMs with MXFP8
by: Mishra, Asit, et al.
Published: (2025)

Not-a-Bandit: Provably No-Regret Drafter Selection in Speculative Decoding for LLMs
by: Liu, Hongyi, et al.
Published: (2025)

MuonBP: Faster Muon via Block-Periodic Orthogonalization
by: Khaled, Ahmed, et al.
Published: (2025)

Block Rotation is All You Need for MXFP4 Quantization
by: Shao, Yuantian, et al.
Published: (2025)

TORQ: Two-Level Orthogonal Rotation for MXFP4 Quantization
by: Xu, Zukang, et al.
Published: (2026)

Pretraining large language models with MXFP4 on Native FP4 Hardware
by: Cim, Musa, et al.
Published: (2026)

Unveiling the Potential of Quantization with MXFP4: Strategies for Quantization Error Reduction
by: Chhugani, Jatin, et al.
Published: (2026)

Collage: Light-Weight Low-Precision Strategy for LLM Training
by: Yu, Tao, et al.
Published: (2024)

ProxSparse: Regularized Learning of Semi-Structured Sparsity Masks for Pretrained LLMs
by: Liu, Hongyi, et al.
Published: (2025)

Online Posterior Sampling with a Diffusion Prior
by: Kveton, Branislav, et al.
Published: (2024)

MXNorm: Reusing MXFP block scales for efficient tensor normalisation
by: McLean, Callum, et al.
Published: (2026)

Diagonal-Tiled Mixed-Precision Attention for Efficient Low-Bit MXFP Inference
by: Ding, Yifu, et al.
Published: (2026)

Decomposing MXFP4 quantization error for LLM reinforcement learning: reducible bias, recoverable deadzone, and an irreducible floor
by: Li, Xiaocan, et al.
Published: (2026)

Shadow Cones: A Generalized Framework for Partial Order Embeddings
by: Yu, Tao, et al.
Published: (2023)

Theoretical Guarantees of Learning Ensembling Strategies with Applications to Time Series Forecasting
by: Hasson, Hilaf, et al.
Published: (2023)

Variance-reduced Zeroth-Order Methods for Fine-Tuning Language Models
by: Gautam, Tanmay, et al.
Published: (2024)

L$^3$: Large Lookup Layers
by: Tseng, Albert, et al.
Published: (2026)

Directional Alignment Mitigates Reward Hacking in Reinforcement Learning for Language Models
by: Deng, Wenlong, et al.
Published: (2026)

RoSTE: An Efficient Quantization-Aware Supervised Fine-Tuning Approach for Large Language Models
by: Wei, Quan, et al.
Published: (2025)

Verifier-free Test-Time Sampling for Vision Language Action Models
by: Jang, Suhyeok, et al.
Published: (2025)

Model-Preserving Adaptive Rounding
by: Tseng, Albert, et al.
Published: (2025)

Metis: Training LLMs with FP4 Quantization
by: Cao, Hengjie, et al.
Published: (2025)

Low-Bit Quantization Favors Undertrained LLMs: Scaling Laws for Quantized LLMs with 100T Training Tokens
by: Ouyang, Xu, et al.
Published: (2024)

QTIP: Quantization with Trellises and Incoherence Processing
by: Tseng, Albert, et al.
Published: (2024)

MuCon: Clipped Muon Updates for LLM Training
by: Yi, Albert
Published: (2026)

Training Dynamics Impact Post-Training Quantization Robustness
by: Catalan-Tatjer, Albert, et al.
Published: (2025)

Inference Optimization of Foundation Models on AI Accelerators
by: Park, Youngsuk, et al.
Published: (2024)

MedM2T: A MultiModal Framework for Time-Aware Modeling with Electronic Health Record and Electrocardiogram Data
by: Kuo, Yu-Chen, et al.
Published: (2025)

Learning-Based WiFi Fingerprint Inpainting via Generative Adversarial Networks
by: Chan, Yu, et al.
Published: (2024)

StreetMath: Study of LLMs' Approximation Behaviors
by: Tseng, Chiung-Yi, et al.
Published: (2025)

Laplace Approximation For Tensor Train Kernel Machines In System Identification
by: Saiapin, Albert, et al.
Published: (2025)

Physics-Informed Neural Network for Predicting Out-of-Training-Range TCAD Solution with Minimized Domain Expertise
by: Lu, Albert, et al.
Published: (2024)

FP4 All the Way: Fully Quantized Training of LLMs
by: Chmiel, Brian, et al.
Published: (2025)

CAAP: Class-Dependent Automatic Data Augmentation Based On Adaptive Policies For Time Series
by: Chang, Tien-Yu, et al.
Published: (2024)

Test-Time Training on Graphs with Large Language Models (LLMs)
by: Zhang, Jiaxin, et al.
Published: (2024)

LLM4TS: Aligning Pre-Trained LLMs as Data-Efficient Time-Series Forecasters
by: Chang, Ching, et al.
Published: (2023)