| Main Author: | Das, Susmit |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2601.05300 |
Similar Items
Entropy-Based Measurement of Value Drift and Alignment Work in Large Language Models
by: Fadli, Samih
Published: (2025)
Bayesian Attention Mechanism: A Probabilistic Framework for Positional Encoding and Context Length Extrapolation
by: Bianchessi, Arthur S., et al.
Published: (2025)
SPHERICAL KV: Angle-Domain Attention and Rate-Distortion Retention for Efficient Long-Context Inference
by: Chauhan, Anay, et al.
Published: (2026)
Advancing Transformer Architecture in Long-Context Large Language Models: A Comprehensive Survey
by: Huang, Yunpeng, et al.
Published: (2023)
Towards Resource-Efficient Multimodal Intelligence: Learned Routing among Specialized Expert Models
by: Saini, Mayank, et al.
Published: (2025)
HyDRA: Hybrid Dynamic Routing Architecture for Heterogeneous LLM Pools
by: Garg, Aashna, et al.
Published: (2026)
Quantization-Robust LLM Unlearning via Low-Rank Adaptation
by: Abitante, João Vitor Boer, et al.
Published: (2026)
Annotation Entropy Predicts Per-Example Learning Dynamics in LoRA Fine-Tuning
by: Steele, Brady
Published: (2026)
OCRR: A Benchmark for Online Correction Recovery under Distribution Shift
by: Grassi, Adrian
Published: (2026)
Self-Pruned Key-Value Attention: Learning When to Write by Predicting Future Utility
by: Szilvasy, Gergely, et al.
Published: (2026)
Ouroboros: Dynamic Weight Generation for Recursive Transformers via Input-Conditioned LoRA Modulation
by: Jaber, Jaber, et al.
Published: (2026)
Neural Activation Patterns Across Language Model Architectures: A Comprehensive Analysis of Cognitive Task Performance
by: Naser-Moghadasi, Mahdi, et al.
Published: (2026)
Representation-Aware Unlearning via Activation Signatures: From Suppression to Entity-Signature Erasure
by: Mahmood, Syed Naveed, et al.
Published: (2026)
Continuous-Depth Transformers with Learned Control Dynamics
by: Jemley, Peter
Published: (2026)
Are LLM Uncertainty and Correctness Encoded by the Same Features? A Functional Dissociation via Sparse Autoencoders
by: Patel, Het, et al.
Published: (2026)
Memory Bank Compression for Continual Adaptation of Large Language Models
by: Katraouras, Thomas, et al.
Published: (2026)
Kronecker Embeddings: Byte-Level Structured Token Representations for Parameter-Efficient Language Models
by: Shravan, Rohan
Published: (2026)
How Language Models Process Out-of-Distribution Inputs: A Two-Pathway Framework
by: Saghir, Hamidreza
Published: (2026)
Stratified Hazard Sampling: Minimal-Variance Event Scheduling for CTMC/DTMC Discrete Diffusion and Flow Models
by: Jang, Seunghwan, et al.
Published: (2026)
Mean-Pooled Cosine Similarity is Not Length-Invariant: Theory and Cross-Domain Evidence for a Length-Invariant Alternative
by: Mitra, Sibayan, et al.
Published: (2026)
The Right Answer, the Wrong Direction: Why Transformers Fail at Counting and How to Fix It
by: Garcia, Gabriel
Published: (2026)
Discovering Transformer Circuits via a Hybrid Attribution and Pruning Framework
by: Gu, Hao, et al.
Published: (2025)
Future Token Prediction -- Causal Language Modelling with Per-Token Semantic State Vector for Multi-Token Prediction
by: Walker, Nicholas
Published: (2024)
When Models Can't Follow: Testing Instruction Adherence Across 256 LLMs
by: Young, Richard J., et al.
Published: (2025)
PersonalLLM: Tailoring LLMs to Individual Preferences
by: Zollo, Thomas P., et al.
Published: (2024)
Rethinking Addressing in Language Models via Contexualized Equivariant Positional Encoding
by: Zhu, Jiajun, et al.
Published: (2025)
Sarcasm Detection in a Less-Resourced Language
by: Đoković, Lazar, et al.
Published: (2024)
Thread Detection and Response Generation using Transformers with Prompt Optimisation
by: T, Kevin Joshua, et al.
Published: (2024)
The Data Efficiency Frontier of Financial Foundation Models: Scaling Laws from Continued Pretraining
by: Ponnock, Jesse
Published: (2025)
Combining Language and Topic Models for Hierarchical Text Classification
by: Toit, Jaco du, et al.
Published: (2025)
Pre-trained Models Perform the Best When Token Distributions Follow Zipf's Law
by: He, Yanjin, et al.
Published: (2025)
Your Pretrained Model Tells the Difficulty Itself: A Self-Adaptive Curriculum Learning Paradigm for Natural Language Understanding
by: Feng, Qi, et al.
Published: (2025)
Language Models Are Implicitly Continuous
by: Marro, Samuele, et al.
Published: (2025)
LLM Vocabulary Compression for Low-Compute Environments
by: Vennam, Sreeram, et al.
Published: (2024)
In-Context Fixation: When Demonstrated Labels Override Semantics in Few-Shot Classification
by: Liu, Ming
Published: (2026)
Contextual Integrity in LLMs via Reasoning and Reinforcement Learning
by: Lan, Guangchen, et al.
Published: (2025)
Grokking in the Wild: Data Augmentation for Real-World Multi-Hop Reasoning with Transformers
by: Abramov, Roman, et al.
Published: (2025)
Counterfactual Likelihood Tests for Indirect Influence in Private Reasoning Channels
by: Lorup, Alexander Boesgaard
Published: (2026)
Beyond Pass@k: Breadth-Depth Metrics for Reasoning Boundaries
by: Dragoi, Marius, et al.
Published: (2025)
Adapting While Learning: Grounding LLMs for Scientific Problems with Intelligent Tool Usage Adaptation
by: Lyu, Bohan, et al.
Published: (2024)