Bibliographic Details
Main Author:	de Silva, Nisansa
Format:	Preprint
Published:	2022
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2202.03457

Similar Items

How Good is BLI as an Alignment Measure: A Study in Word Embedding Paradigm
by: Wickramasinghe, Kasun, et al.
Published: (2025)

Sinhala-English Word Embedding Alignment: Introducing Datasets and Benchmark for a Low Resource Language
by: Wickramasinghe, Kasun, et al.
Published: (2023)

Survey on Publicly Available Sinhala Natural Language Processing Tools and Research
by: de Silva, Nisansa
Published: (2019)

SiDiaC: Sinhala Diachronic Corpus
by: Jayatilleke, Nevidu, et al.
Published: (2025)

Sinhala Physical Common Sense Reasoning Dataset for Global PIQA
by: de Silva, Nisansa, et al.
Published: (2026)

Zero-shot OCR Accuracy of Low-Resourced Languages: A Comparative Analysis on Sinhala and Tamil
by: Jayatilleke, Nevidu, et al.
Published: (2025)

Fine Tuning Named Entity Extraction Models for the Fantasy Domain
by: Sivaganeshan, Aravinth, et al.
Published: (2024)

SHADE: Semantic Hypernym Annotator for Domain-specific Entities -- DnD Domain Use Case
by: Peiris, Akila, et al.
Published: (2024)

Automatically Detecting Amusing Games in Wordle
by: Luo, Ronaldo, et al.
Published: (2025)

GeeSanBhava: Sentiment Tagged Sinhala Music Video Comment Data Set
by: De Mel, Yomal, et al.
Published: (2025)

DESS: DeBERTa Enhanced Syntactic-Semantic Aspect Sentiment Triplet Extraction
by: Thenuwara, Vishal, et al.
Published: (2025)

Semantic, Orthographic, and Phonological Biases in Humans' Wordle Gameplay
by: Liang, Jiadong, et al.
Published: (2024)

Linguistic Analysis of Sinhala YouTube Comments on Sinhala Music Videos: A Dataset Study
by: De Mel, W. M. Yomal, et al.
Published: (2025)

M2DS: Multilingual Dataset for Multi-document Summarisation
by: Hewapathirana, Kushan, et al.
Published: (2024)

Improving the quality of Web-mined Parallel Corpora of Low-Resource Languages using Debiasing Heuristics
by: Fernando, Aloka, et al.
Published: (2025)

Shoulders of Giants: A Look at the Degree and Utility of Openness in NLP Research
by: Ranathunga, Surangika, et al.
Published: (2024)

LMSpell: Neural Spell Checking for Low-Resource Languages
by: Gunathilake, Akesh, et al.
Published: (2025)

Quality Does Matter: A Detailed Look at the Quality and Utility of Web-Mined Parallel Corpora
by: Ranathunga, Surangika, et al.
Published: (2024)

Large Language Models Lack Understanding of Character Composition of Words
by: Shin, Andrew, et al.
Published: (2024)

Chinese Word Boundary Recovery through Character Alignment Projection
by: Wang, Lusha, et al.
Published: (2026)

SiDiaC-v.2.0: Sinhala Diachronic Corpus Version 2.0
by: Jayatilleke, Nevidu, et al.
Published: (2026)

Utilizing Multilingual Encoders to Improve Large Language Models for Low-Resource Languages
by: Puranegedara, Imalsha, et al.
Published: (2025)

Word Recovery in Large Language Models Enables Character-Level Tokenization Robustness
by: Yang, Zhipeng, et al.
Published: (2026)

SinLlama -- A Large Language Model for Sinhala
by: Aravinda, H. W. K., et al.
Published: (2025)

Sinhala Transliteration: A Comparative Analysis Between Rule-based and Seq2Seq Approaches
by: De Mel, Yomal, et al.
Published: (2024)

Character-Level Chinese Dependency Parsing via Modeling Latent Intra-Word Structure
by: Hou, Yang, et al.
Published: (2024)

Accurate and Efficient Statistical Testing for Word Semantic Breadth
by: Ehara, Yo
Published: (2026)

Statistical Uncertainty in Word Embeddings: GloVe-V
by: Vallebueno, Andrea, et al.
Published: (2024)

Function Words as Statistical Cues for Language Learning
by: Yang, Xiulin, et al.
Published: (2026)

Constraint Satisfaction Approaches to Wordle: Novel Heuristics and Cross-Lexicon Validation
by: Arafat, Jahidul, et al.
Published: (2025)

Aspect-Based Sentiment Analysis Techniques: A Comparative Study
by: Jayakody, Dineth, et al.
Published: (2024)

Evaluating Computational Representations of Character: An Austen Character Similarity Benchmark
by: Yang, Funing, et al.
Published: (2024)

CharacterBench: Benchmarking Character Customization of Large Language Models
by: Zhou, Jinfeng, et al.
Published: (2024)

Word Embedding Dimension Reduction via Weakly-Supervised Feature Selection
by: Xue, Jintang, et al.
Published: (2024)

Explaining Datasets in Words: Statistical Models with Natural Language Parameters
by: Zhong, Ruiqi, et al.
Published: (2024)

C-LLM: Learn to Check Chinese Spelling Errors Character by Character
by: Li, Kunting, et al.
Published: (2024)

SLoW: Select Low-frequency Words! Automatic Dictionary Selection for Translation on Large Language Models
by: Lu, Hongyuan, et al.
Published: (2025)

Puzzle Game: Prediction and Classification of Wordle Solution Words
by: Xin, Haidong, et al.
Published: (2024)

Going Beyond Word Matching: Syntax Improves In-context Example Selection for Machine Translation
by: Tang, Chenming, et al.
Published: (2024)

Evaluating Character Understanding of Large Language Models via Character Profiling from Fictional Works
by: Yuan, Xinfeng, et al.
Published: (2024)