| Main Authors: | Jantsch, Lasse Marten, Koh, Dong-Jae, Lee, Seonghyeon, Suh, Young-Kyoon |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2603.19742 |
Similar Items
Depth Registers Unlock W4A4 on SwiGLU: A Reader/Generator Decomposition
by: Liu, Ziyang
Published: (2026)
AudioMAE++: learning better masked audio representations with SwiGLU FFNs
by: Yadav, Sarthak, et al.
Published: (2025)
LoLA-SpecViT: Local Attention SwiGLU Vision Transformer with LoRA for Hyperspectral Imaging
by: Zidi, Fadi Abdeladhim, et al.
Published: (2025)
Symmetry-Compatible Principle for Optimizer Design: Embeddings, LM Heads, SwiGLU MLPs, and MoE Routers
by: Lau, Tim Tsz-Kit, et al.
Published: (2026)
Hidden Heroes and Gradient Bloats: Layer-Wise Redundancy Inverts Attribution in Transformers
by: Ye, Donald
Published: (2026)
GLU Attention Improve Transformer
by: Wang, Zehao
Published: (2025)
AttnLRP: Attention-Aware Layer-Wise Relevance Propagation for Transformers
by: Achtibat, Reduan, et al.
Published: (2024)
Table Transformers for Imputing Textual Attributes
by: Wei, Ting-Ruen, et al.
Published: (2024)
GLUScope: A Tool for Analyzing GLU Neurons in Transformer Language Models
by: Gerstner, Sebastian, et al.
Published: (2026)
Explanation Regularisation through the Lens of Attributions
by: Ferreira, Pedro, et al.
Published: (2024)
Reliable, Adaptable, and Attributable Language Models with Retrieval
by: Asai, Akari, et al.
Published: (2024)
SwiReasoning: Switch-Thinking in Latent and Explicit for Pareto-Superior Reasoning LLMs
by: Shi, Dachuan, et al.
Published: (2025)
Approximate Attributions for Off-the-Shelf Siamese Transformers
by: Möller, Lucas, et al.
Published: (2024)
Dual Perspectives in Emotion Attribution: A Generator-Interpreter Framework for Cross-Cultural Analysis of Emotion in LLMs
by: Turdubaeva, Aizirek, et al.
Published: (2026)
Attribution functionalism
by: Phelan, Mark
Published: (2025)
Efficient Text-Attributed Graph Learning through Selective Annotation and Graph Alignment
by: Xie, Huanyi, et al.
Published: (2025)
SwiLTra-Bench: The Swiss Legal Translation Benchmark
by: Niklaus, Joel, et al.
Published: (2025)
Adaptive Layer Selection for Layer-Wise Token Pruning in LLM Inference
by: Taniguchi, Rei, et al.
Published: (2026)
Multi-Attribute Steering of Language Models via Targeted Intervention
by: Nguyen, Duy, et al.
Published: (2025)
AttributionBench: How Hard is Automatic Attribution Evaluation?
by: Li, Yifei, et al.
Published: (2024)
LLMCache: Layer-Wise Caching Strategies for Accelerated Reuse in Transformer Inference
by: Bansal, Harsh Vardhan
Published: (2025)
Advancing Attribution-Based Neural Network Explainability through Relative Absolute Magnitude Layer-Wise Relevance Propagation and Multi-Component Evaluation
by: Vukadin, Davor, et al.
Published: (2024)
Prune, Interpret, Evaluate: A Cross-Layer Transcoder-Native Framework for Efficient Circuit Discovery via Feature Attribution
by: Chen, Qinhao, et al.
Published: (2026)
VISTA: Visualization of Token Attribution via Efficient Analysis
by: Ahmed, Syed, et al.
Published: (2026)
KoDialogBench: Evaluating Conversational Understanding of Language Models with Korean Dialogue Benchmark
by: Jang, Seongbo, et al.
Published: (2024)
CLT-Forge: A Scalable Library for Cross-Layer Transcoders and Attribution Graphs
by: Draye, Florent, et al.
Published: (2026)
Learning to Attribute with Attention
by: Cohen-Wang, Benjamin, et al.
Published: (2025)
Advancing Large Language Model Attribution through Self-Improving
by: Huang, Lei, et al.
Published: (2024)
Bias Analysis and Mitigation through Protected Attribute Detection and Regard Classification
by: Udagawa, Takuma, et al.
Published: (2025)
Fine-grained Analysis of Brain-LLM Alignment through Input Attribution
by: Proietti, Michela, et al.
Published: (2025)
One Arrow, Many Targets: Probing LLMs for Multi-Attribute Controllable Text Summarization
by: Roy, Tathagato, et al.
Published: (2024)
Efficient Estimation of Kernel Surrogate Models for Task Attribution
by: Zhang, Zhenshuo, et al.
Published: (2026)
Evaluating the Smooth Control of Attribute Intensity in Text Generation with LLMs
by: Zhou, Shang, et al.
Published: (2024)
Learning to Explain: Supervised Token Attribution from Transformer Attention Patterns
by: Mihaila, George
Published: (2026)
Mitigating Quantization Errors Due to Activation Spikes in GLU-Based LLMs
by: Yang, Jaewoo, et al.
Published: (2024)
On the Effectiveness of Integration Methods for Multimodal Dialogue Response Retrieval
by: Jang, Seongbo, et al.
Published: (2025)
Improving Attributed Long-form Question Answering with Intent Awareness
by: Zhao, Xinran, et al.
Published: (2026)
Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded Attributions and Learning to Refuse
by: Song, Maojia, et al.
Published: (2024)
MentalAgora: A Gateway to Advanced Personalized Care in Mental Health through Multi-Agent Debating and Attribute Control
by: Lee, Yeonji, et al.
Published: (2024)
ExpertQA: Expert-Curated Questions and Attributed Answers
by: Malaviya, Chaitanya, et al.
Published: (2023)