| Main Author: | Muhamad Fajar Putranto |
|---|---|
| Format: | Digital resource |
| Language: | English |
| Published: | Zenodo, 2026 |
| Online Access: | https://doi.org/10.5281/zenodo.19938436 |
Table of Contents:
Companion artifact archive for the paper "IntLLM: Compiler-Verified 1.58-Bit Ternary Language Models for Operating-System Kernel Inference" by Muhamad Fajar Putranto (TaxPrime / PrimeCore.id, Jakarta, Indonesia).

Contents (3 tarballs, all reproducible from the public fajarkraton/fajarquant repository at the v0.4.0-phase-d release tag):

1. fajarquant-v0.4.0-intllm.tar.gz
   Source archive of the fajarquant codebase at the IntLLM paper submission revision. Includes the Phase D training pipeline, Phase E1 bilingual corpus tooling, FajarQuant v3.1 KV cache quantization (Arm B), and the verify-intllm-tables R7 audit gate.
   Generated via: `git archive --format=tar.gz HEAD`

2. phase-e1-bilingual-corpus-v1.0.tar.gz
   The 25.67 B-token bilingual Indonesian + English training corpus described in paper §6:
   - 15.40 B Indonesian tokens (CulturaY ID + FineWeb-2 ID + Wikipedia ID)
   - 10.27 B English tokens (DKYoon/SlimPajama-6B sliced + reduced)

   60:40 ID:EN mix, 0% synthetic, 0.0254% exact-hash dedup rate. License attribution per NOTICE_BILINGUAL_CORPUS_V1 is included.

3. intllm-checkpoints-mini-base-medium.tar.gz
   Trained model checkpoints for the Mini (22 M), Base (46 M), and Medium (74 M) variants. Each is the final-step checkpoint from the scaling-chain training runs reported in paper Table 1. PyTorch state_dict format; loadable via intllm.model.HGRNBitForCausalLM.

License: Apache 2.0 for code; Apache 2.0 plus per-source attribution for the corpus (see NOTICE_BILINGUAL_CORPUS_V1); Apache 2.0 for checkpoints.

Reproducibility: every numeric claim in the companion paper is backed by an artifact in this archive and verified by `make verify-intllm-tables --strict` in the source repo (40/40 activated claims pass at submission time).
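The corpus figures quoted in the description can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the archive's own tooling: the variable and function names are hypothetical, and the token counts are copied from the description above rather than measured from the tarball.

```python
# Token counts in billions, copied from the archive description
# (assumed values for illustration; not recomputed from the corpus itself).
indonesian_b = 15.40
english_b = 10.27
stated_total_b = 25.67


def mix_shares(id_tokens, en_tokens):
    """Return (total, Indonesian share, English share) for two token counts."""
    total = id_tokens + en_tokens
    return total, id_tokens / total, en_tokens / total


total_b, id_share, en_share = mix_shares(indonesian_b, english_b)

# The two language slices should sum to the stated corpus size,
# and the shares should round to the stated 60:40 ID:EN mix.
print(f"total = {total_b:.2f} B tokens (stated: {stated_total_b:.2f} B)")
print(f"ID share = {id_share:.1%}, EN share = {en_share:.1%}")
```

Under these inputs the slices sum to the stated 25.67 B tokens, and the shares round to 60% Indonesian and 40% English, consistent with the described mix.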