Bibliographic Details
Main Authors: Kuzmenko, Yurii; Trofimova, Olena
Format: Digital resource
Language: English
Published: Zenodo 2026
Online Access: https://doi.org/10.5281/zenodo.19724945
Summary:
  • <p class="font-claude-response-body break-words whitespace-normal leading-[1.7]"><strong>Companion to:</strong> Ordering Dominance in Low-Data NMT Regimes: Corpus Sequencing Outperforms Contradiction Magnitude (Part I: Structural Ablation Evidence)</p> <p class="font-claude-response-body break-words whitespace-normal leading-[1.7]"><strong>Computational artifact:</strong> Module 3.0 — Regime Isolation and Diagnostic Stability Verification (see Related identifiers)</p> <p class="font-claude-response-body break-words whitespace-normal leading-[1.7]"><strong>Article 2</strong> operationalizes the Behavioral Regimes layer (Layer 2) of the SIF diagnostic pipeline. Building on Part I, which established corpus ordering as the dominant structural factor over contradiction magnitude, the present study asks whether the observed stability patterns depend on regime-level training parameters.</p> <p class="font-claude-response-body break-words whitespace-normal leading-[1.7]">By holding all structural components invariant and varying only regularization strength (λ_H, λ_C) and the timing of negation exposure (NEG), we demonstrate that distinct and reproducible operating regimes emerge as a controlled consequence of parameter selection. Three regimes are characterized across 8 structured epochs and 3 fixed random seeds:</p> <ul> <li><strong>Conservative</strong> (λ_H=0.10, no NEG): maximal invariant preservation</li> <li><strong>Balanced</strong> (λ_H=0.25, NEG at epochs 5–6): cross-article reference regime</li> <li><strong>Stress-Tolerant</strong> (λ_H=0.50, NEG at epochs 5–6): bounded sensitivity with history-dependent persistence</li> </ul> <p class="font-claude-response-body break-words whitespace-normal leading-[1.7]">All regimes preserve global embedding invariants (vocabulary-level entropy remains strictly invariant at 10.9724 ± 0.0000) while differing systematically in sentence-level self-consistency and negation robustness. 
An artifact-level Extended Stress probe (λ_H=5.0) demonstrates loss–geometry decoupling: a 10× increase in regularization strength causes a ~106% increase in training loss, while embedding-level cosine geometry at epoch 8 differs by Δ < 0.001 across all seeds.</p> <p class="font-claude-response-body break-words whitespace-normal leading-[1.7]">Additional controls (ordering randomization and ΔV-topology randomization, with 100 shuffles and 2000 bootstrap resamples) confirm that ordering effects (Δ ≈ 0.052–0.106) exceed ΔV-scaffold effects (Δ ≈ 0.011–0.015) by approximately 7×.</p> <p class="font-claude-response-body break-words whitespace-normal leading-[1.7]">All findings are strictly bounded to: N=142, EN→UA, MarianMT LoRA fine-tuning, low-data protocol. No generalization to unconstrained corpora, alternative architectures, or open-domain settings is implied.</p>
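<p class="font-claude-response-body break-words whitespace-normal leading-[1.7]">The three-regime design described in the summary can be sketched as a small configuration grid. This is an illustrative sketch only, not the authors' Module 3.0 code: the names <code>Regime</code>, <code>neg_schedule</code>, and <code>run_grid</code> are hypothetical, the seed values are placeholders (the summary states only that three fixed seeds were used), and λ_C is left out because the summary does not give its values.</p>

```python
from dataclasses import dataclass
from itertools import product
from typing import List, Tuple

N_EPOCHS = 8        # from the summary: 8 structured epochs
SEEDS = (0, 1, 2)   # ASSUMPTION: placeholder values; only the count (3) is stated


@dataclass(frozen=True)
class Regime:
    """One operating regime: only lambda_h and NEG timing vary between regimes."""
    name: str
    lambda_h: float               # regularization strength (lambda_H in the summary)
    neg_epochs: Tuple[int, ...]   # 1-indexed epochs with negation exposure


# Values taken from the summary's regime list; lambda_C is not reported there.
REGIMES = (
    Regime("conservative", 0.10, ()),
    Regime("balanced", 0.25, (5, 6)),
    Regime("stress_tolerant", 0.50, (5, 6)),
)


def neg_schedule(regime: Regime, n_epochs: int = N_EPOCHS) -> List[bool]:
    """Return one flag per epoch: True where negation examples are exposed."""
    return [epoch in regime.neg_epochs for epoch in range(1, n_epochs + 1)]


def run_grid(regimes=REGIMES, seeds=SEEDS) -> List[Tuple[Regime, int]]:
    """Enumerate every (regime, seed) pair; structural components stay fixed."""
    return list(product(regimes, seeds))


if __name__ == "__main__":
    for regime, seed in run_grid():
        print(regime.name, seed, regime.lambda_h, neg_schedule(regime))
```

<p class="font-claude-response-body break-words whitespace-normal leading-[1.7]">Keeping each regime as a frozen record makes the isolation explicit: any difference between two runs with the same seed can only come from <code>lambda_h</code> or the negation schedule.</p>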