| Main Authors: | Arnold, Stefan, Fietta, Marian, Yesilbas, Dilara |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2409.14107 |
Similar Items
Memorization in Language Models through the Lens of Intrinsic Dimension
by: Arnold, Stefan
Published: (2025)
Steering Prepositional Phrases in Language Models: A Case of with-headed Adjectival and Adverbial Complements in Gemma-2
by: Arnold, Stefan, et al.
Published: (2025)
Documentation Practices of Artificial Intelligence
by: Arnold, Stefan, et al.
Published: (2024)
Women Are Beautiful, Men Are Leaders: Gender Stereotypes in Machine Translation and Language Modeling
by: Pikuliak, Matúš, et al.
Published: (2023)
Differentially-Private Text Rewriting reshapes Linguistic Style
by: Arnold, Stefan
Published: (2026)
Inspecting the Representation Manifold of Differentially-Private Text
by: Arnold, Stefan
Published: (2025)
Lookahead Routing for Large Language Models
by: Huang, Canbin, et al.
Published: (2025)
SparseD: Sparse Attention for Diffusion Language Models
by: Wang, Zeqing, et al.
Published: (2025)
LPC-SM: Local Predictive Coding and Sparse Memory for Long-Context Language Modeling
by: Xie, Keqin
Published: (2026)
Soft Language Prompts for Language Transfer
by: Vykopal, Ivan, et al.
Published: (2024)
Towards Enabling FAIR Dataspaces Using Large Language Models
by: Arnold, Benedikt T., et al.
Published: (2024)
Towards Generalizable Implicit In-Context Learning with Attention Routing
by: Li, Jiaqian, et al.
Published: (2025)
SPLA: Block Sparse Plus Linear Attention for Long Context Modeling
by: Wang, Bailin, et al.
Published: (2026)
Generative Large Language Models in Automated Fact-Checking: A Survey
by: Vykopal, Ivan, et al.
Published: (2024)
Building Efficient and Effective OpenQA Systems for Low-Resource Languages
by: Budur, Emrah, et al.
Published: (2024)
Route Before Retrieve: Activating Latent Routing Abilities of LLMs for RAG vs. Long-Context Selection
by: Chen, Yiwen, et al.
Published: (2026)
Text-Routed Sparse Mixture-of-Experts Model with Explanation and Temporal Alignment for Multi-Modal Sentiment Analysis
by: Rao, Dongning, et al.
Published: (2025)
UniBERT: Adversarial Training for Language-Universal Representations
by: Avram, Andrei-Marius, et al.
Published: (2025)
Mixture of Sparse Attention: Content-Based Learnable Sparse Attention via Expert-Choice Routing
by: Piękos, Piotr, et al.
Published: (2025)
Long-Context Language Modeling with Parallel Context Encoding
by: Yen, Howard, et al.
Published: (2024)
MiCRo: Mixture Modeling and Context-aware Routing for Personalized Preference Learning
by: Shen, Jingyan, et al.
Published: (2025)
Sparse Reward Subsystem in Large Language Models
by: Xu, Guowei, et al.
Published: (2026)
Understanding Refusal in Language Models with Sparse Autoencoders
by: Yeo, Wei Jie, et al.
Published: (2025)
Sparse Subnetwork Enhancement for Underrepresented Languages in Large Language Models
by: Gurgurov, Daniil, et al.
Published: (2025)
Long-Context Generalization with Sparse Attention
by: Vasylenko, Pavlo, et al.
Published: (2025)
Lag-Relative Sparse Attention In Long Context Training
by: Liang, Manlai, et al.
Published: (2025)
$π$-Attention: Periodic Sparse Transformers for Efficient Long-Context Modeling
by: Liu, Dong, et al.
Published: (2025)
Routing Absorption in Sparse Attention: Why Random Gates Are Hard to Beat
by: Aquino-Michaels, Keston
Published: (2026)
On-Policy Context Distillation for Language Models
by: Ye, Tianzhu, et al.
Published: (2026)
In-Context Watermarks for Large Language Models
by: Liu, Yepeng, et al.
Published: (2025)
SparseEval: Efficient Evaluation of Large Language Models by Sparse Optimization
by: Zhang, Taolin, et al.
Published: (2026)
Sparse Matrix in Large Language Model Fine-tuning
by: He, Haoze, et al.
Published: (2024)
In-Context Former: Lightning-fast Compressing Context for Large Language Model
by: Wang, Xiangfeng, et al.
Published: (2024)
RAGRouter: Learning to Route Queries to Multiple Retrieval-Augmented Language Models
by: Zhang, Jiarui, et al.
Published: (2025)
Adamas: Hadamard Sparse Attention for Efficient Long-Context Inference
by: Yan, Siyuan, et al.
Published: (2025)
Large Language Models for Multilingual Previously Fact-Checked Claim Detection
by: Vykopal, Ivan, et al.
Published: (2025)
ARS: Automatic Routing Solver with Large Language Models
by: Li, Kai, et al.
Published: (2025)
NestedKV: Nested Memory Routing for Long-Context KV Cache Compression
by: Chen, Hong, et al.
Published: (2026)
A Cross-Validation Study of Turkish Sentiment Analysis Datasets and Tools
by: Çakıcı, Şevval, et al.
Published: (2024)
TurnBack: A Geospatial Route Cognition Benchmark for Large Language Models through Reverse Route
by: Luo, Hongyi, et al.
Published: (2025)