Library Catalog

Bibliographic Details
Main Author:	Jiang, Yuhang
Format:	Preprint
Published:	2026
Subjects:	Machine Learning; Computation and Language
Online Access:	https://arxiv.org/abs/2606.00926

Similar Items

Task-driven Layerwise Additive Activation Intervention
by: Nguyen, Hieu Trung, et al.
Published: (2025)

From Compression to Expression: A Layerwise Analysis of In-Context Learning
by: Jiang, Jiachen, et al.
Published: (2025)

Explicitly Encoding Structural Symmetry is Key to Length Generalization in Arithmetic Tasks
by: Sabbaghi, Mahdi, et al.
Published: (2024)

Detection vs. Execution: Single-Bucket Probes Miss Half the Mamba-2 State Sink
by: Jiang, Yuhang
Published: (2026)

Layerwise Recall and the Geometry of Interwoven Knowledge in LLMs
by: Lei, Ge, et al.
Published: (2025)

Outlier-weighed Layerwise Sampling for LLM Fine-tuning
by: Li, Pengxiang, et al.
Published: (2024)

Layerwise Change of Knowledge in Neural Networks
by: Cheng, Xu, et al.
Published: (2024)

Encoding Agent Trajectories as Representations with Sequence Transformers
by: Tsiligkaridis, Athanasios, et al.
Published: (2024)

Multilingual Language Models Encode Script Over Linguistic Structure
by: Verma, Aastha A K, et al.
Published: (2026)

Adaptive Large Language Models By Layerwise Attention Shortcuts
by: Verma, Prateek, et al.
Published: (2024)

R2T: Rule-Encoded Loss Functions for Low-Resource Sequence Tagging
by: Keita, Mamadou K., et al.
Published: (2025)

LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning
by: Pan, Rui, et al.
Published: (2024)

The UNDO Flip-Flop: A Controlled Probe for Reversible Semantic State Management in State Space Model
by: Zhou, Hongxu
Published: (2026)

GLiClass: Generalist Lightweight Model for Sequence Classification Tasks
by: Stepanov, Ihor, et al.
Published: (2025)

Pretrained Generative Language Models as General Learning Frameworks for Sequence-Based Tasks
by: Fauber, Ben
Published: (2024)

Structured Recurrent Mixers for Massively Parallelized Sequence Generation
by: Badger, Benjamin L.
Published: (2026)

What Do Language Models Learn in Context? The Structured Task Hypothesis
by: Li, Jiaoda, et al.
Published: (2024)

Emergent World Representations: Exploring a Sequence Model Trained on a Synthetic Task
by: Li, Kenneth, et al.
Published: (2022)

Sequences of Logits Reveal the Low Rank Structure of Language Models
by: Golowich, Noah, et al.
Published: (2025)

On the "Induction Bias" in Sequence Models
by: Ebrahimi, M. Reza, et al.
Published: (2026)

Assessing Episodic Memory in LLMs with Sequence Order Recall Tasks
by: Pink, Mathis, et al.
Published: (2024)

Plan, Verify and Fill: A Structured Parallel Decoding Approach for Diffusion Language Models
by: Li, Miao, et al.
Published: (2026)

Struc-EMB: The Potential of Structure-Aware Encoding in Language Embeddings
by: Liu, Shikun, et al.
Published: (2025)

Training Large Reasoning Models Efficiently via Progressive Thought Encoding
by: Zhang, Zeliang, et al.
Published: (2026)

On the Geometry of Positional Encodings in Transformers
by: Cirrincione, Giansalvo
Published: (2026)

Large Language Models Encode Semantics and Alignment in Linearly Separable Representations
by: Saglam, Baturay, et al.
Published: (2025)

SpikingSSMs: Learning Long Sequences with Sparse and Parallel Spiking State Space Models
by: Shen, Shuaijie, et al.
Published: (2024)

Sparse Autoencoder Decomposition of Clinical Sequence Model Representations: Feature Complexity, Task Specialisation, and Mortality Prediction
by: Sainsbury, Chris, et al.
Published: (2026)

Language Models over Canonical Byte-Pair Encodings
by: Vieira, Tim, et al.
Published: (2025)

Sequence-to-Sequence Spanish Pre-trained Language Models
by: Araujo, Vladimir, et al.
Published: (2023)

Long-range Modeling and Processing of Multimodal Event Sequences
by: Li, Jichu, et al.
Published: (2026)

ParaScopes: What do Language Models Activations Encode About Future Text?
by: Pochinkov, Nicky, et al.
Published: (2025)

Transforming Chatbot Text: A Sequence-to-Sequence Approach
by: Reddy, Natesh, et al.
Published: (2025)

Temporal Tokenization Strategies for Event Sequence Modeling with Large Language Models
by: Liu, Zefang, et al.
Published: (2025)

Mitigating Reversal Curse in Large Language Models via Semantic-aware Permutation Training
by: Guo, Qingyan, et al.
Published: (2024)

Neural Sequence-to-Sequence Modeling with Attention by Leveraging Deep Learning Architectures for Enhanced Contextual Understanding in Abstractive Text Summarization
by: Challagundla, Bhavith Chandra, et al.
Published: (2024)

Learning-Time Encoding Shapes Unlearning in LLMs
by: Wu, Ruihan, et al.
Published: (2025)

The Necessity of Imperfection: Reversing Model Collapse via Simulating Cognitive Boundedness
by: Jiang, Zhongjie
Published: (2025)

Insertion Language Models: Sequence Generation with Arbitrary-Position Insertions
by: Patel, Dhruvesh, et al.
Published: (2025)

Enhanced Structured State Space Models via Grouped FIR Filtering and Attention Sink Mechanisms
by: Meng, Tian, et al.
Published: (2024)