Bibliographic Details
Main Author: Sagić, Andrija
Format: Digital resource
Language: English
Published: Zenodo 2026
Subjects:
Online Access: https://doi.org/10.5281/zenodo.19409889
Summary:
  • <p>Although Chain-of-Thought (CoT) prompting has dramatically improved the reasoning capabilities of Large Language Models (LLMs), it introduces severe computational inefficiency through excessive token generation. Two dominant failure patterns are identified: “Token Wall” failures (endless generative loops of self-doubt and conversational filler) and overly linear step-by-step or dotted-checklist reasoning before reaching a conclusion. To address both, Clear Chain-of-Thought (ClearCoT) is introduced, a methodology based on the philosophical principle of “clear and precise thinking” that enforces a rigorous structural connection between the Question, the Reasoning process, and the Answer (Q–R–A).<br>The entire pipeline was tested and executed on consumer-grade local hardware (24 GB VRAM). First, a 27B-parameter Teacher model synthesizes philosophical reasoning pairs, which are used to train a 4B-parameter MasterMind Model. The MasterMind Model then acts as a “logical optimizer” tool, restructuring publicly available multi-domain datasets into logically optimized examples that are used to fine-tune a 2B-parameter test model (Occam-2B).<br>Evaluated against the base model, Occam-2B shows a statistically significant 6.71 % accuracy increase on MMLU-Pro (P &lt; 0.0001) while reducing 95th-percentile token bloat by approximately 69 %. On complex logic tasks, the ClearCoT methodology eliminates up to 89.4 % of reasoning tokens and almost entirely resolves catastrophic “Token Wall” failure loops. Overall, the paper demonstrates that prioritizing logical structure over data volume enables small models to achieve reasoning precision, and it establishes the MasterMind Model as an effective, reusable tool for optimizing any available dataset that features a &lt;think&gt; process. 
The resulting Occam-2B model is publicly available at <a href="https://huggingface.co/collections/Sagicc/occam-clearcot" target="_blank" rel="noopener">huggingface.co/collections/Sagicc/occam-clearcot</a>.</p>
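The summary reports a reduction in 95th-percentile token count without saying how that statistic is computed. The following is a minimal sketch of one plausible calculation, assuming a nearest-rank percentile and a simple relative difference. The function names and the token counts are hypothetical and purely illustrative; they are not taken from the paper or its evaluation data.

```python
import math


def percentile(values, q):
    """Nearest-rank percentile: the smallest value with at least q% of the data at or below it."""
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]


def relative_reduction(baseline, optimized):
    """Fraction by which `optimized` is smaller than `baseline`."""
    return (baseline - optimized) / baseline


# Made-up per-response token counts (illustration only, not measured results).
base_tokens = [100, 120, 150, 200, 250, 300, 400, 600, 900, 1000]
tuned_tokens = [80, 90, 100, 110, 120, 130, 150, 200, 250, 500]

p95_base = percentile(base_tokens, 95)
p95_tuned = percentile(tuned_tokens, 95)
reduction = relative_reduction(p95_base, p95_tuned)
print(f"p95 base={p95_base}, p95 tuned={p95_tuned}, reduction={reduction:.0%}")
```

A tail percentile is a natural choice for this kind of comparison because runaway generations such as the “Token Wall” loops mostly affect the longest responses, which a mean or median can understate.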