:: Library Catalog

Bibliographic Details
Main Author:	Larsen, Erik
Format:	Preprint
Published:	2025
Subjects:	Machine Learning; Artificial Intelligence; Computation and Language; I.2.7; I.2.6
Online Access:	https://arxiv.org/abs/2512.12066

Similar Items

Entropy-Based Measurement of Value Drift and Alignment Work in Large Language Models
by: Fadli, Samih
Published: (2025)

AMEL: Accumulated Message Effects on LLM Judgments
by: Temkit, Sid-Ali
Published: (2026)

TIAR: Trajectory-Informed Advantage Reweighting for LLM Abstention Learning
by: Pan, Muyu, et al.
Published: (2026)

Harnessing Negative Signals: Reinforcement Distillation from Teacher Data for LLM Reasoning
by: Xu, Shuyao, et al.
Published: (2025)

CorrSteer: Generation-Time LLM Steering via Correlated Sparse Autoencoder Features
by: Cho, Seonglae, et al.
Published: (2025)

LLM attribution analysis across different fine-tuning strategies and model scales for automated code compliance
by: Shi, Jack Wei Lun, et al.
Published: (2026)

Point of Order: Action-Aware LLM Persona Modeling for Realistic Civic Simulation
by: Merrill, Scott, et al.
Published: (2025)

Perturbation Dose Responses in Recursive LLM Loops: Raw Switching, Stochastic Floors, and Persistent Escape under Append, Replace, and Dialog Updates
by: Kaplanski, Pawel
Published: (2026)

Cognitive Load Limits in Large Language Models: Benchmarking Multi-Hop Reasoning
by: Adapala, Sai Teja Reddy
Published: (2025)

Control Reinforcement Learning: Interpretable Token-Level Steering of LLMs via Sparse Autoencoder Features
by: Cho, Seonglae, et al.
Published: (2026)

In-Context Fixation: When Demonstrated Labels Override Semantics in Few-Shot Classification
by: Liu, Ming
Published: (2026)

No Free Swap: Protocol-Dependent Layer Redundancy in Transformers
by: Garcia, Gabriel
Published: (2026)

Beyond Pass@k: Breadth-Depth Metrics for Reasoning Boundaries
by: Dragoi, Marius, et al.
Published: (2025)

Revisiting Intermediate-Layer Matching in Knowledge Distillation: Layer-Selection Strategy Doesn't Matter (Much)
by: Yu, Zony, et al.
Published: (2025)

Painless Activation Steering: An Automated, Lightweight Approach for Post-Training Large Language Models
by: Cui, Sasha, et al.
Published: (2025)

The Last Word Often Wins: A Format Confound in Chain-of-Thought Corruption Studies
by: Garcia, Gabriel
Published: (2026)

Counterfactual Likelihood Tests for Indirect Influence in Private Reasoning Channels
by: Lorup, Alexander Boesgaard
Published: (2026)

Pressure-Testing Deception Probes in LLMs: Scaling, Robustness, and the Geometry of Deceptive Representations
by: Kumar, Sachin
Published: (2026)

Structured Prompt Optimization Meets Reinforcement Learning for Global and Local Interpretability over Complex Text
by: Zhou, Tianyang, et al.
Published: (2026)

Prototype Transformer: Towards Language Model Architectures Interpretable by Design
by: Yordanov, Yordan, et al.
Published: (2026)

Forget Attention: Importance-Aware Attention Is All You Need
by: Shin, Soohyeong, et al.
Published: (2026)

Enhancing Burmese News Classification with Kolmogorov-Arnold Network Head Fine-tuning
by: Aung, Thura, et al.
Published: (2025)

MaPPO: Maximum a Posteriori Preference Optimization with Prior Knowledge
by: Lan, Guangchen, et al.
Published: (2025)

Turning the TIDE: Cross-Architecture Distillation for Diffusion Large Language Models
by: Zhang, Gongbo, et al.
Published: (2026)

Dodo: Dynamic Contextual Compression for Decoder-only LMs
by: Qin, Guanghui, et al.
Published: (2023)

Model Collapse as Cultural Evolution
by: Guo, Dongxin, et al.
Published: (2026)

Adapting While Learning: Grounding LLMs for Scientific Problems with Intelligent Tool Usage Adaptation
by: Lyu, Bohan, et al.
Published: (2024)

EasyMath: A 0-shot Math Benchmark for SLMs
by: Karki, Drishya, et al.
Published: (2025)

DPO Unchained: Your Training Algorithm is Secretly Disentangled in Human Choice Theory
by: Zhou, Wenxuan, et al.
Published: (2025)

Contextual Integrity in LLMs via Reasoning and Reinforcement Learning
by: Lan, Guangchen, et al.
Published: (2025)

Diagnosing and Addressing Pitfalls in KG-RAG Datasets: Toward More Reliable Benchmarking
by: Zhang, Liangliang, et al.
Published: (2025)

Generalizing Numerical Reasoning in Table Data through Operation Sketches and Self-Supervised Learning
by: Cho, Hanjun, et al.
Published: (2026)

Beyond Hallucinations: A Composite Score for Measuring Reliability in Open-Source Large Language Models
by: Salla, Rohit Kumar, et al.
Published: (2025)

Language as a Wave Phenomenon: Semantic Phase Locking and Interference in Neural Networks
by: Yıldırım, Alper, et al.
Published: (2025)

Alternating Reinforcement Learning with Contextual Rubric Rewards: Beyond the Scalarization Strategy
by: Lan, Guangchen, et al.
Published: (2026)

Graph Memory Transformer (GMT)
by: Zanarini, Nicola, et al.
Published: (2026)

Domain-Specific Pretraining of Language Models: A Comparative Study in the Medical Field
by: Kerner, Tobias
Published: (2024)

The Deterministic Horizon: When Extended Reasoning Fails and Tool Delegation Becomes Necessary
by: Guo, Dongxin, et al.
Published: (2026)

Self-Training Doesn't Flatten Language -- It Restructures It: Surface Markers Amplify While Deep Syntax Dies
by: Liu, Ming
Published: (2026)

The Readout Shortcut: Positional Number Copying Dominates Arithmetic CoT Readout in Small Language Models
by: Liu, Ming
Published: (2026)