:: Library Catalog

Bibliographic Details
Main Authors:	Wickramasinghe, Kasun; de Silva, Nisansa
Format:	Preprint
Published:	2025
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2511.13040

Similar Items

Sinhala-English Word Embedding Alignment: Introducing Datasets and Benchmark for a Low Resource Language
by: Wickramasinghe, Kasun, et al.
Published: (2023)

Sinhala Transliteration: A Comparative Analysis Between Rule-based and Seq2Seq Approaches
by: De Mel, Yomal, et al.
Published: (2024)

Selecting Seed Words for Wordle using Character Statistics
by: de Silva, Nisansa
Published: (2022)

Survey on Publicly Available Sinhala Natural Language Processing Tools and Research
by: de Silva, Nisansa
Published: (2019)

Zero-shot OCR Accuracy of Low-Resourced Languages: A Comparative Analysis on Sinhala and Tamil
by: Jayatilleke, Nevidu, et al.
Published: (2025)

SiDiaC: Sinhala Diachronic Corpus
by: Jayatilleke, Nevidu, et al.
Published: (2025)

Sinhala Physical Common Sense Reasoning Dataset for Global PIQA
by: de Silva, Nisansa, et al.
Published: (2026)

Fine Tuning Named Entity Extraction Models for the Fantasy Domain
by: Sivaganeshan, Aravinth, et al.
Published: (2024)

Linguistic Analysis of Sinhala YouTube Comments on Sinhala Music Videos: A Dataset Study
by: De Mel, W. M. Yomal, et al.
Published: (2025)

SHADE: Semantic Hypernym Annotator for Domain-specific Entities -- DnD Domain Use Case
by: Peiris, Akila, et al.
Published: (2024)

GeeSanBhava: Sentiment Tagged Sinhala Music Video Comment Data Set
by: De Mel, Yomal, et al.
Published: (2025)

DESS: DeBERTa Enhanced Syntactic-Semantic Aspect Sentiment Triplet Extraction
by: Thenuwara, Vishal, et al.
Published: (2025)

M2DS: Multilingual Dataset for Multi-document Summarisation
by: Hewapathirana, Kushan, et al.
Published: (2024)

Shoulders of Giants: A Look at the Degree and Utility of Openness in NLP Research
by: Ranathunga, Surangika, et al.
Published: (2024)

Quality Does Matter: A Detailed Look at the Quality and Utility of Web-Mined Parallel Corpora
by: Ranathunga, Surangika, et al.
Published: (2024)

Improving the quality of Web-mined Parallel Corpora of Low-Resource Languages using Debiasing Heuristics
by: Fernando, Aloka, et al.
Published: (2025)

LMSpell: Neural Spell Checking for Low-Resource Languages
by: Gunathilake, Akesh, et al.
Published: (2025)

Enhancing Cross-lingual Sentence Embedding for Low-resource Languages with Word Alignment
by: Miao, Zhongtao, et al.
Published: (2024)

The Greatest Good Benchmark: Measuring LLMs' Alignment with Utilitarian Moral Dilemmas
by: Marraffini, Giovanni Franco Gabriel, et al.
Published: (2025)

Locally Measuring Cross-lingual Lexical Alignment: A Domain and Word Level Perspective
by: Karidi, Taelin, et al.
Published: (2024)

Utilizing Multilingual Encoders to Improve Large Language Models for Low-Resource Languages
by: Puranegedara, Imalsha, et al.
Published: (2025)

SiDiaC-v.2.0: Sinhala Diachronic Corpus Version 2.0
by: Jayatilleke, Nevidu, et al.
Published: (2026)

Correspondence Analysis and PMI-Based Word Embeddings: A Comparative Study
by: Qi, Qianqian, et al.
Published: (2024)

SinLlama -- A Large Language Model for Sinhala
by: Aravinda, H. W. K., et al.
Published: (2025)

One Word Is Not Enough: Simple Prompts Improve Word Embeddings
by: Ranjan, Rajeev
Published: (2025)

Aspect-Based Sentiment Analysis Techniques: A Comparative Study
by: Jayakody, Dineth, et al.
Published: (2024)

Revisiting Word Embeddings in the LLM Era
by: Mahajan, Yash, et al.
Published: (2025)

Evaluating Metrics for Bias in Word Embeddings
by: Schröder, Sarah, et al.
Published: (2021)

An Exploratory Analysis on the Explanatory Potential of Embedding-Based Measures of Semantic Transparency for Malay Word Recognition
by: Mohamed, M. Maziyah, et al.
Published: (2025)

Exploring Interpretability of Independent Components of Word Embeddings with Automated Word Intruder Test
by: Musil, Tomáš, et al.
Published: (2022)

Analyzing Correlations Between Intrinsic and Extrinsic Bias Metrics of Static Word Embeddings With Their Measuring Biases Aligned
by: Katô, Taisei, et al.
Published: (2024)

Subword Tokenization Strategies for Kurdish Word Embeddings
by: Salehi, Ali, et al.
Published: (2025)

PWESuite: Phonetic Word Embeddings and Tasks They Facilitate
by: Zouhar, Vilém, et al.
Published: (2023)

Word Alignment as Preference for Machine Translation
by: Wu, Qiyu, et al.
Published: (2024)

GWPT: A Green Word-Embedding-based POS Tagger
by: Wei, Chengwei, et al.
Published: (2024)

Word Alignment-Based Evaluation of Uniform Meaning Representations
by: Zeman, Daniel, et al.
Published: (2026)

How to Compute the Probability of a Word
by: Pimentel, Tiago, et al.
Published: (2024)

Learning Complex Word Embeddings in Classical and Quantum Spaces
by: Harvey, Carys, et al.
Published: (2024)

Statistical Uncertainty in Word Embeddings: GloVe-V
by: Vallebueno, Andrea, et al.
Published: (2024)

A Comprehensive Empirical Evaluation of Existing Word Embedding Approaches
by: Zaland, Obaidullah, et al.
Published: (2023)