| Main Authors: | Basoz, Merve; Horne, Andrew; Opper, Mattia |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2603.01732 |
Similar Items
StrAE: Autoencoding for Pre-Trained Embeddings using Explicit Structure
by: Opper, Mattia, et al.
Published: (2023)
Banyan: Improved Representation Learning with Explicit Structure
by: Opper, Mattia, et al.
Published: (2024)
Self-StrAE at SemEval-2024 Task 1: Making Self-Structuring AutoEncoders Learn More With Less
by: Opper, Mattia, et al.
Published: (2024)
LLMs for Translation: Historical, Low-Resourced Languages and Contemporary AI Models
by: Tekgurler, Merve
Published: (2025)
TRA: Better Length Generalisation with Threshold Relative Attention
by: Opper, Mattia, et al.
Published: (2025)
Compositional Generalization Across Distributional Shifts with Sparse Tree Operations
by: Soulos, Paul, et al.
Published: (2024)
NagaNLP: Bootstrapping NLP for Low-Resource Nagamese Creole with Human-in-the-Loop Synthetic Data
by: Maiti, Agniva, et al.
Published: (2025)
TharuChat: Bootstrapping Large Language Models for a Low-Resource Language via Synthetic Data and Human Validation
by: Panth, Prajwal, et al.
Published: (2026)
Mechanisms of Symbol Processing for In-Context Learning in Transformer Networks
by: Smolensky, Paul, et al.
Published: (2024)
LGSE: Lexically Grounded Subword Embedding Initialization for Low-Resource Language Adaptation
by: Teklehaymanot, Hailay, et al.
Published: (2026)
Sinhala-English Word Embedding Alignment: Introducing Datasets and Benchmark for a Low Resource Language
by: Wickramasinghe, Kasun, et al.
Published: (2023)
BanglaEmbed: Efficient Sentence Embedding Models for a Low-Resource Language Using Cross-Lingual Distillation Techniques
by: Kabir, Muhammad Rafsan, et al.
Published: (2024)
The Zeno's Paradox of 'Low-Resource' Languages
by: Nigatu, Hellina Hailu, et al.
Published: (2024)
BAGEL: Bootstrapping Agents by Guiding Exploration with Language
by: Murty, Shikhar, et al.
Published: (2024)
SEA: Low-Resource Safety Alignment for Multimodal Large Language Models via Synthetic Embeddings
by: Lu, Weikai, et al.
Published: (2025)
Less is More: Adapting Text Embeddings for Low-Resource Languages with Small Scale Noisy Synthetic Data
by: Navasardyan, Zaruhi, et al.
Published: (2026)
GlotLID: Language Identification for Low-Resource Languages
by: Kargaran, Amir Hossein, et al.
Published: (2023)
Bootstrapping Fuzzers for Compilers of Low-Resource Language Dialects Using Language Models
by: Vaidya, Sairam, et al.
Published: (2025)
GrEmLIn: A Repository of Green Baseline Embeddings for 87 Low-Resource Languages Injected with Multilingual Graph Knowledge
by: Gurgurov, Daniil, et al.
Published: (2024)
Reducing Tokenization Premiums for Low-Resource Languages
by: Churchill, Geoffrey, et al.
Published: (2026)
Investigating Hallucination in Conversations for Low Resource Languages
by: Das, Amit, et al.
Published: (2025)
On Multilingual Encoder Language Model Compression for Low-Resource Languages
by: Gurgurov, Daniil, et al.
Published: (2025)
Evaluation of Chunking Strategies for Effective Text Embedding in Low-Resource Language on Agricultural Documents
by: Chhoun, Sovandara, et al.
Published: (2026)
LLM Probe: Evaluating LLMs for Low-Resource Languages
by: Teklehaymanot, Hailay Kidu, et al.
Published: (2026)
Task Arithmetic with Support Languages for Low-Resource ASR
by: Rafkin, Emma, et al.
Published: (2026)
LMSpell: Neural Spell Checking for Low-Resource Languages
by: Gunathilake, Akesh, et al.
Published: (2025)
Massively Multilingual Text Translation For Low-Resource Languages
by: Zhou, Zhong
Published: (2024)
LLMs for Extremely Low-Resource Finno-Ugric Languages
by: Purason, Taido, et al.
Published: (2024)
Unsupervised Bilingual Lexicon Induction for Low Resource Languages
by: Rathnayake, Charitha, et al.
Published: (2024)
Large Language Models for Toxic Language Detection in Low-Resource Balkan Languages
by: Muminovic, Amel, et al.
Published: (2025)
Bootstrapping Language Models with DPO Implicit Rewards
by: Chen, Changyu, et al.
Published: (2024)
Performance of Recent Large Language Models for a Low-Resourced Language
by: Jayakody, Ravindu, et al.
Published: (2024)
AI Diffusion in Low Resource Language Countries
by: Misra, Amit, et al.
Published: (2025)
LINGOLY: A Benchmark of Olympiad-Level Linguistic Reasoning Puzzles in Low-Resource and Extinct Languages
by: Bean, Andrew M., et al.
Published: (2024)
CHURRO: Making History Readable with an Open-Weight Large Vision-Language Model for High-Accuracy, Low-Cost Historical Text Recognition
by: Semnani, Sina J., et al.
Published: (2025)
One Instruction Does Not Fit All: How Well Do Embeddings Align Personas and Instructions in Low-Resource Indian Languages?
by: Shah, Arya, et al.
Published: (2026)
On Limitations of LLM as Annotator for Low Resource Languages
by: Jadhav, Suramya, et al.
Published: (2024)
Transformers for Low-Resource Languages: Is Féidir Linn!
by: Lankford, Séamus, et al.
Published: (2024)
Utilizing Multilingual Encoders to Improve Large Language Models for Low-Resource Languages
by: Puranegedara, Imalsha, et al.
Published: (2025)
Is Small Language Model the Silver Bullet to Low-Resource Languages Machine Translation?
by: Song, Yewei, et al.
Published: (2025)