| Main Author: | Wietrzykowski, Tomasz |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2603.19348 |
Similar Items
Attention Sinks as Internal Signals for Hallucination Detection in Large Language Models
by: Binkowski, Jakub, et al.
Published: (2026)
HiGPT: Heterogeneous Graph Language Model
by: Tang, Jiabin, et al.
Published: (2024)
HMoE: Heterogeneous Mixture of Experts for Language Modeling
by: Wang, An, et al.
Published: (2024)
Aligning language models with human preferences
by: Korbak, Tomasz
Published: (2024)
Large Language Model-driven Meta-structure Discovery in Heterogeneous Information Network
by: Chen, Lin, et al.
Published: (2024)
Sparser, Faster, Lighter Transformer Language Models
by: Cetin, Edoardo, et al.
Published: (2026)
Selective Neuron Amplification in Transformer Language Models
by: Akhtar, Ryyan, et al.
Published: (2026)
Mixture of Heterogeneous Grouped Experts for Language Modeling
by: Ma, Zhicheng, et al.
Published: (2026)
Jamba: A Hybrid Transformer-Mamba Language Model
by: Lieber, Opher, et al.
Published: (2024)
Training Language Models with Language Feedback at Scale
by: Scheurer, Jérémy, et al.
Published: (2023)
Can Large Language Models Transform Computational Social Science?
by: Ziems, Caleb, et al.
Published: (2023)
SAP: Syntactic Attention Pruning for Transformer-based Language Models
by: Lee, Tzu-Yun, et al.
Published: (2025)
Emergent Stack Representations in Modeling Counter Languages Using Transformers
by: Tiwari, Utkarsh, et al.
Published: (2025)
Core Context Aware Transformers for Long Context Language Modeling
by: Chen, Yaofo, et al.
Published: (2024)
Development of Pre-Trained Transformer-based Models for the Nepali Language
by: Thapa, Prajwal, et al.
Published: (2024)
Circuit Component Reuse Across Tasks in Transformer Language Models
by: Merullo, Jack, et al.
Published: (2023)
MYTE: Morphology-Driven Byte Encoding for Better and Fairer Multilingual Language Modeling
by: Limisiewicz, Tomasz, et al.
Published: (2024)
Dynamic Topic Language Model on Heterogeneous Children's Mental Health Clinical Notes
by: Ye, Hanwen, et al.
Published: (2023)
Experience Sharing in Mutual Reinforcement Learning for Heterogeneous Language Models
by: Liu, Xiaoze, et al.
Published: (2026)
GLUScope: A Tool for Analyzing GLU Neurons in Transformer Language Models
by: Gerstner, Sebastian, et al.
Published: (2026)
Correcting Suppressed Log-Probabilities in Language Models with Post-Transformer Adapters
by: Sanchez, Bryan
Published: (2026)
Circuit Compositions: Exploring Modular Structures in Transformer-Based Language Models
by: Mondorf, Philipp, et al.
Published: (2024)
Flexible Language Modeling in Continuous Space with Transformer-based Autoregressive Flows
by: Zhang, Ruixiang, et al.
Published: (2025)
Automatic Pruning of Fine-tuning Datasets for Transformer-based Language Models
by: Tayaranian, Mohammadreza, et al.
Published: (2024)
Heterogeneity in Formal Linguistic Competence of Language Models: Is Data the Real Bottleneck?
by: Renduchintala, H S V N S Kowndinya, et al.
Published: (2026)
StreaMulT: Streaming Multimodal Transformer for Heterogeneous and Arbitrary Long Sequential Data
by: Pellegrain, Victor, et al.
Published: (2021)
LakotaBERT: A Transformer-based Model for Low Resource Lakota Language
by: Parankusham, Kanishka, et al.
Published: (2025)
Heterogeneous Value Alignment Evaluation for Large Language Models
by: Zhang, Zhaowei, et al.
Published: (2023)
Fast Byte Latent Transformer
by: Kallini, Julie, et al.
Published: (2026)
Debiasing Algorithm through Model Adaptation
by: Limisiewicz, Tomasz, et al.
Published: (2023)
Revisiting the Shape Convention of Transformer Language Models
by: Liao, Feng-Ting, et al.
Published: (2026)
RVPO: Risk-Sensitive Alignment via Variance Regularization
by: Montero, Ivan, et al.
Published: (2026)
Partially Rewriting a Transformer in Natural Language
by: Paulo, Gonçalo, et al.
Published: (2025)
Contextual Graph Transformer: A Small Language Model for Enhanced Engineering Document Information Extraction
by: Reddy, Karan, et al.
Published: (2025)
CMLFormer: A Dual Decoder Transformer with Switching Point Learning for Code-Mixed Language Modeling
by: Baral, Aditeya, et al.
Published: (2025)
Modeling Bilingual Sentence Processing: Evaluating RNN and Transformer Architectures for Cross-Language Structural Priming
by: Zhang, Demi, et al.
Published: (2024)
Value-Aware Numerical Representations for Transformer Language Models
by: Dutulescu, Andreea, et al.
Published: (2026)
Latent Concept Disentanglement in Transformer-based Language Models
by: Hong, Guan Zhe, et al.
Published: (2025)
Limits of Transformer Language Models on Learning to Compose Algorithms
by: Thomm, Jonathan, et al.
Published: (2024)
TransformLLM: Adapting Large Language Models via LLM-Transformed Reading Comprehension Text
by: Arbel, Iftach, et al.
Published: (2024)