Library Catalog

Bibliographic Details
Main Author:	Das, Susmit
Format:	Preprint
Published:	2026
Subjects:	Machine Learning; Computation and Language; I.2.7; I.2.6
Online Access:	https://arxiv.org/abs/2601.05300

Similar Items

Entropy-Based Measurement of Value Drift and Alignment Work in Large Language Models
by: Fadli, Samih
Published: (2025)

Bayesian Attention Mechanism: A Probabilistic Framework for Positional Encoding and Context Length Extrapolation
by: Bianchessi, Arthur S., et al.
Published: (2025)

SPHERICAL KV: Angle-Domain Attention and Rate-Distortion Retention for Efficient Long-Context Inference
by: Chauhan, Anay, et al.
Published: (2026)

Advancing Transformer Architecture in Long-Context Large Language Models: A Comprehensive Survey
by: Huang, Yunpeng, et al.
Published: (2023)

Towards Resource-Efficient Multimodal Intelligence: Learned Routing among Specialized Expert Models
by: Saini, Mayank, et al.
Published: (2025)

HyDRA: Hybrid Dynamic Routing Architecture for Heterogeneous LLM Pools
by: Garg, Aashna, et al.
Published: (2026)

Quantization-Robust LLM Unlearning via Low-Rank Adaptation
by: Abitante, João Vitor Boer, et al.
Published: (2026)

Annotation Entropy Predicts Per-Example Learning Dynamics in LoRA Fine-Tuning
by: Steele, Brady
Published: (2026)

OCRR: A Benchmark for Online Correction Recovery under Distribution Shift
by: Grassi, Adrian
Published: (2026)

Self-Pruned Key-Value Attention: Learning When to Write by Predicting Future Utility
by: Szilvasy, Gergely, et al.
Published: (2026)

Ouroboros: Dynamic Weight Generation for Recursive Transformers via Input-Conditioned LoRA Modulation
by: Jaber, Jaber, et al.
Published: (2026)

Neural Activation Patterns Across Language Model Architectures: A Comprehensive Analysis of Cognitive Task Performance
by: Naser-Moghadasi, Mahdi, et al.
Published: (2026)

Representation-Aware Unlearning via Activation Signatures: From Suppression to Entity-Signature Erasure
by: Mahmood, Syed Naveed, et al.
Published: (2026)

Continuous-Depth Transformers with Learned Control Dynamics
by: Jemley, Peter
Published: (2026)

Are LLM Uncertainty and Correctness Encoded by the Same Features? A Functional Dissociation via Sparse Autoencoders
by: Patel, Het, et al.
Published: (2026)

Memory Bank Compression for Continual Adaptation of Large Language Models
by: Katraouras, Thomas, et al.
Published: (2026)

Kronecker Embeddings: Byte-Level Structured Token Representations for Parameter-Efficient Language Models
by: Shravan, Rohan
Published: (2026)

How Language Models Process Out-of-Distribution Inputs: A Two-Pathway Framework
by: Saghir, Hamidreza
Published: (2026)

Stratified Hazard Sampling: Minimal-Variance Event Scheduling for CTMC/DTMC Discrete Diffusion and Flow Models
by: Jang, Seunghwan, et al.
Published: (2026)

Mean-Pooled Cosine Similarity is Not Length-Invariant: Theory and Cross-Domain Evidence for a Length-Invariant Alternative
by: Mitra, Sibayan, et al.
Published: (2026)

The Right Answer, the Wrong Direction: Why Transformers Fail at Counting and How to Fix It
by: Garcia, Gabriel
Published: (2026)

Discovering Transformer Circuits via a Hybrid Attribution and Pruning Framework
by: Gu, Hao, et al.
Published: (2025)

Future Token Prediction -- Causal Language Modelling with Per-Token Semantic State Vector for Multi-Token Prediction
by: Walker, Nicholas
Published: (2024)

When Models Can't Follow: Testing Instruction Adherence Across 256 LLMs
by: Young, Richard J., et al.
Published: (2025)

PersonalLLM: Tailoring LLMs to Individual Preferences
by: Zollo, Thomas P., et al.
Published: (2024)

Rethinking Addressing in Language Models via Contexualized Equivariant Positional Encoding
by: Zhu, Jiajun, et al.
Published: (2025)

Sarcasm Detection in a Less-Resourced Language
by: Đoković, Lazar, et al.
Published: (2024)

Thread Detection and Response Generation using Transformers with Prompt Optimisation
by: T, Kevin Joshua, et al.
Published: (2024)

The Data Efficiency Frontier of Financial Foundation Models: Scaling Laws from Continued Pretraining
by: Ponnock, Jesse
Published: (2025)

Combining Language and Topic Models for Hierarchical Text Classification
by: Toit, Jaco du, et al.
Published: (2025)

Pre-trained Models Perform the Best When Token Distributions Follow Zipf's Law
by: He, Yanjin, et al.
Published: (2025)

Your Pretrained Model Tells the Difficulty Itself: A Self-Adaptive Curriculum Learning Paradigm for Natural Language Understanding
by: Feng, Qi, et al.
Published: (2025)

Language Models Are Implicitly Continuous
by: Marro, Samuele, et al.
Published: (2025)

LLM Vocabulary Compression for Low-Compute Environments
by: Vennam, Sreeram, et al.
Published: (2024)

In-Context Fixation: When Demonstrated Labels Override Semantics in Few-Shot Classification
by: Liu, Ming
Published: (2026)

Contextual Integrity in LLMs via Reasoning and Reinforcement Learning
by: Lan, Guangchen, et al.
Published: (2025)

Grokking in the Wild: Data Augmentation for Real-World Multi-Hop Reasoning with Transformers
by: Abramov, Roman, et al.
Published: (2025)

Counterfactual Likelihood Tests for Indirect Influence in Private Reasoning Channels
by: Lorup, Alexander Boesgaard
Published: (2026)

Beyond Pass@k: Breadth-Depth Metrics for Reasoning Boundaries
by: Dragoi, Marius, et al.
Published: (2025)

Adapting While Learning: Grounding LLMs for Scientific Problems with Intelligent Tool Usage Adaptation
by: Lyu, Bohan, et al.
Published: (2024)