| Main Authors: | Tseng, Albert; Sun, Qingyao; Hou, David; De Sa, Christopher |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2406.11235 |
Similar Items
QuIP#: Even Better LLM Quantization with Hadamard Incoherence and Lattice Codebooks
by: Tseng, Albert, et al.
Published: (2024)
L$^3$: Large Lookup Layers
by: Tseng, Albert, et al.
Published: (2026)
Model-Preserving Adaptive Rounding
by: Tseng, Albert, et al.
Published: (2025)
Shadow Cones: A Generalized Framework for Partial Order Embeddings
by: Yu, Tao, et al.
Published: (2023)
QuIP: 2-Bit Quantization of Large Language Models With Guarantees
by: Chee, Jerry, et al.
Published: (2023)
ModuLoRA: Finetuning 2-Bit LLMs on Consumer GPUs by Integrating with Modular Quantizers
by: Yin, Junjie, et al.
Published: (2023)
Training LLMs with MXFP4
by: Tseng, Albert, et al.
Published: (2025)
Incoherence in Goal-Conditioned Autoregressive Models
by: Karwowski, Jacek, et al.
Published: (2025)
PTQTP: Post-Training Quantization to Trit-Planes for Large Language Models
by: Xiao, He, et al.
Published: (2025)
Non-Determinism and the Lawlessness of Machine Learning Code
by: Cooper, A. Feder, et al.
Published: (2022)
Preserve-Then-Quantize: Balancing Rank Budgets for Quantization Error Reconstruction in LLMs
by: Cho, Yoonjun, et al.
Published: (2026)
Training Dynamics Impact Post-Training Quantization Robustness
by: Catalan-Tatjer, Albert, et al.
Published: (2025)
Exploring Layer-wise Information Effectiveness for Post-Training Quantization in Small Language Models
by: Xiao, He, et al.
Published: (2025)
BPDQ: Bit-Plane Decomposition Quantization on a Variable Grid for Large Language Models
by: Chen, Junyu, et al.
Published: (2026)
Gradient Descent on Logistic Regression: Do Large Step-Sizes Work with Data on the Sphere?
by: Meng, Si Yi, et al.
Published: (2025)
Foundations of GenIR
by: Ai, Qingyao, et al.
Published: (2025)
Gradient Descent on Logistic Regression with Non-Separable Data and Large Step Sizes
by: Meng, Si Yi, et al.
Published: (2024)
DUEL: Exact Likelihood for Masked Diffusion via Deterministic Unmasking
by: Turok, Gilad, et al.
Published: (2026)
Incoherence as Oracle-less Measure of Error in LLM-Based Code Generation
by: Valentin, Thomas, et al.
Published: (2025)
FlatQuant: Flatness Matters for LLM Quantization
by: Sun, Yuxuan, et al.
Published: (2024)
A Comparative Analysis of Microrings Based Incoherent Photonic GEMM Accelerators
by: Vatsavai, Sairam Sri, et al.
Published: (2024)
Searching for Efficient Linear Layers over a Continuous Space of Structured Matrices
by: Potapczynski, Andres, et al.
Published: (2024)
Interpretation of the Einstein Field Equation in QTIP Cosmology
by: Choi, Seungyoung
Published: (2026)
STAT: Shrinking Transformers After Training
by: Flynn, Megan, et al.
Published: (2024)
Distribution-Free Process Monitoring with Conformal Prediction
by: Burger, Christopher
Published: (2025)
Masked Vector Quantization
by: Nguyen, David D., et al.
Published: (2023)
Restructuring Vector Quantization with the Rotation Trick
by: Fifty, Christopher, et al.
Published: (2024)
Metis: Training LLMs with FP4 Quantization
by: Cao, Hengjie, et al.
Published: (2025)
Option-ID Based Elimination For Multiple Choice Questions
by: Zhu, Zhenhao, et al.
Published: (2025)
Subject-Specific Analysis of Self-Initiated Attention Shifts from EEG with Controlled Internal and External Attention Conditions
by: Zeng, Yuwen, et al.
Published: (2026)
Beyond Experience Retrieval: Learning to Generate Utility-Optimized Structured Experience for Frozen LLMs
by: Li, Xuancheng, et al.
Published: (2026)
Uncertainty Quantification for Multimodal Large Language Models with Incoherence-adjusted Semantic Volume
by: Lau, Gregory Kang Ruey, et al.
Published: (2026)
Causal Post-Processing of Predictive Models
by: Fernández-Loría, Carlos, et al.
Published: (2024)
OAC: Output-adaptive Calibration for Accurate Post-training Quantization
by: Edalati, Ali, et al.
Published: (2024)
Modulated Diffusion: Accelerating Generative Modeling with Modulated Quantization
by: Gao, Weizhi, et al.
Published: (2025)
MedM2T: A MultiModal Framework for Time-Aware Modeling with Electronic Health Record and Electrocardiogram Data
by: Kuo, Yu-Chen, et al.
Published: (2025)
LF2L: Loss Fusion Horizontal Federated Learning Across Heterogeneous Feature Spaces Using External Datasets Effectively: A Case Study in Second Primary Cancer Prediction
by: Lin, Chia-Fu, et al.
Published: (2026)
Compute-Optimal Quantization-Aware Training
by: Dremov, Aleksandr, et al.
Published: (2025)
Exploiting Chaotic Dynamics as Deep Neural Networks
by: Liu, Shuhong, et al.
Published: (2024)
SKIM: Any-bit Quantization Pushing The Limits of Post-Training Quantization
by: Bai, Runsheng, et al.
Published: (2024)