| Main Authors: | Wickramasinghe, Kasun, de Silva, Nisansa |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2511.13040 |
Similar Items
Sinhala-English Word Embedding Alignment: Introducing Datasets and Benchmark for a Low Resource Language
by: Wickramasinghe, Kasun, et al.
Published: (2023)
Sinhala Transliteration: A Comparative Analysis Between Rule-based and Seq2Seq Approaches
by: De Mel, Yomal, et al.
Published: (2024)
Selecting Seed Words for Wordle using Character Statistics
by: de Silva, Nisansa
Published: (2022)
Survey on Publicly Available Sinhala Natural Language Processing Tools and Research
by: de Silva, Nisansa
Published: (2019)
Zero-shot OCR Accuracy of Low-Resourced Languages: A Comparative Analysis on Sinhala and Tamil
by: Jayatilleke, Nevidu, et al.
Published: (2025)
SiDiaC: Sinhala Diachronic Corpus
by: Jayatilleke, Nevidu, et al.
Published: (2025)
Sinhala Physical Common Sense Reasoning Dataset for Global PIQA
by: de Silva, Nisansa, et al.
Published: (2026)
Fine Tuning Named Entity Extraction Models for the Fantasy Domain
by: Sivaganeshan, Aravinth, et al.
Published: (2024)
Linguistic Analysis of Sinhala YouTube Comments on Sinhala Music Videos: A Dataset Study
by: De Mel, W. M. Yomal, et al.
Published: (2025)
SHADE: Semantic Hypernym Annotator for Domain-specific Entities -- DnD Domain Use Case
by: Peiris, Akila, et al.
Published: (2024)
GeeSanBhava: Sentiment Tagged Sinhala Music Video Comment Data Set
by: De Mel, Yomal, et al.
Published: (2025)
DESS: DeBERTa Enhanced Syntactic-Semantic Aspect Sentiment Triplet Extraction
by: Thenuwara, Vishal, et al.
Published: (2025)
M2DS: Multilingual Dataset for Multi-document Summarisation
by: Hewapathirana, Kushan, et al.
Published: (2024)
Shoulders of Giants: A Look at the Degree and Utility of Openness in NLP Research
by: Ranathunga, Surangika, et al.
Published: (2024)
Quality Does Matter: A Detailed Look at the Quality and Utility of Web-Mined Parallel Corpora
by: Ranathunga, Surangika, et al.
Published: (2024)
Improving the quality of Web-mined Parallel Corpora of Low-Resource Languages using Debiasing Heuristics
by: Fernando, Aloka, et al.
Published: (2025)
LMSpell: Neural Spell Checking for Low-Resource Languages
by: Gunathilake, Akesh, et al.
Published: (2025)
Enhancing Cross-lingual Sentence Embedding for Low-resource Languages with Word Alignment
by: Miao, Zhongtao, et al.
Published: (2024)
The Greatest Good Benchmark: Measuring LLMs' Alignment with Utilitarian Moral Dilemmas
by: Marraffini, Giovanni Franco Gabriel, et al.
Published: (2025)
Locally Measuring Cross-lingual Lexical Alignment: A Domain and Word Level Perspective
by: Karidi, Taelin, et al.
Published: (2024)
Utilizing Multilingual Encoders to Improve Large Language Models for Low-Resource Languages
by: Puranegedara, Imalsha, et al.
Published: (2025)
SiDiaC-v.2.0: Sinhala Diachronic Corpus Version 2.0
by: Jayatilleke, Nevidu, et al.
Published: (2026)
Correspondence Analysis and PMI-Based Word Embeddings: A Comparative Study
by: Qi, Qianqian, et al.
Published: (2024)
SinLlama -- A Large Language Model for Sinhala
by: Aravinda, H. W. K., et al.
Published: (2025)
One Word Is Not Enough: Simple Prompts Improve Word Embeddings
by: Ranjan, Rajeev
Published: (2025)
Aspect-Based Sentiment Analysis Techniques: A Comparative Study
by: Jayakody, Dineth, et al.
Published: (2024)
Revisiting Word Embeddings in the LLM Era
by: Mahajan, Yash, et al.
Published: (2025)
Evaluating Metrics for Bias in Word Embeddings
by: Schröder, Sarah, et al.
Published: (2021)
An Exploratory Analysis on the Explanatory Potential of Embedding-Based Measures of Semantic Transparency for Malay Word Recognition
by: Mohamed, M. Maziyah, et al.
Published: (2025)
Exploring Interpretability of Independent Components of Word Embeddings with Automated Word Intruder Test
by: Musil, Tomáš, et al.
Published: (2022)
Analyzing Correlations Between Intrinsic and Extrinsic Bias Metrics of Static Word Embeddings With Their Measuring Biases Aligned
by: Katô, Taisei, et al.
Published: (2024)
Subword Tokenization Strategies for Kurdish Word Embeddings
by: Salehi, Ali, et al.
Published: (2025)
PWESuite: Phonetic Word Embeddings and Tasks They Facilitate
by: Zouhar, Vilém, et al.
Published: (2023)
Word Alignment as Preference for Machine Translation
by: Wu, Qiyu, et al.
Published: (2024)
GWPT: A Green Word-Embedding-based POS Tagger
by: Wei, Chengwei, et al.
Published: (2024)
Word Alignment-Based Evaluation of Uniform Meaning Representations
by: Zeman, Daniel, et al.
Published: (2026)
How to Compute the Probability of a Word
by: Pimentel, Tiago, et al.
Published: (2024)
Learning Complex Word Embeddings in Classical and Quantum Spaces
by: Harvey, Carys, et al.
Published: (2024)
Statistical Uncertainty in Word Embeddings: GloVe-V
by: Vallebueno, Andrea, et al.
Published: (2024)
A Comprehensive Empirical Evaluation of Existing Word Embedding Approaches
by: Zaland, Obaidullah, et al.
Published: (2023)