:: Library Catalog

Bibliographic Details
Main authors:	Gunathilake, Akesh; Karunarathna, Nadil; Bandaranayake, Tharusha; de Silva, Nisansa; Ranathunga, Surangika
Format:	Preprint
Published:	2025
Subjects:	Computation and Language
Online access:	https://arxiv.org/abs/2512.05414

Similar items

Sinhala Physical Common Sense Reasoning Dataset for Global PIQA
by: de Silva, Nisansa, et al.
Published: (2026)

Utilizing Multilingual Encoders to Improve Large Language Models for Low-Resource Languages
by: Puranegedara, Imalsha, et al.
Published: (2025)

Improving the quality of Web-mined Parallel Corpora of Low-Resource Languages using Debiasing Heuristics
by: Fernando, Aloka, et al.
Published: (2025)

Linguistic Entity Masking to Improve Cross-Lingual Representation of Multilingual Language Models for Low-Resource Languages
by: Fernando, Aloka, et al.
Published: (2025)

Shoulders of Giants: A Look at the Degree and Utility of Openness in NLP Research
by: Ranathunga, Surangika, et al.
Published: (2024)

Quality Does Matter: A Detailed Look at the Quality and Utility of Web-Mined Parallel Corpora
by: Ranathunga, Surangika, et al.
Published: (2024)

Sinhala Transliteration: A Comparative Analysis Between Rule-based and Seq2Seq Approaches
by: De Mel, Yomal, et al.
Published: (2024)

SinLlama -- A Large Language Model for Sinhala
by: Aravinda, H. W. K., et al.
Published: (2025)

Unsupervised Bilingual Lexicon Induction for Low Resource Languages
by: Rathnayake, Charitha, et al.
Published: (2024)

Unlocking Parameter-Efficient Fine-Tuning for Low-Resource Language Translation
by: Su, Tong, et al.
Published: (2024)

Zero-shot OCR Accuracy of Low-Resourced Languages: A Comparative Analysis on Sinhala and Tamil
by: Jayatilleke, Nevidu, et al.
Published: (2025)

Sinhala-English Word Embedding Alignment: Introducing Datasets and Benchmark for a Low Resource Language
by: Wickramasinghe, Kasun, et al.
Published: (2023)

Beyond Vanilla Fine-Tuning: Leveraging Multistage, Multilingual, and Domain-Specific Methods for Low-Resource Machine Translation
by: Thillainathan, Sarubi, et al.
Published: (2025)

Large Language Models for Ingredient Substitution in Food Recipes using Supervised Fine-tuning and Direct Preference Optimization
by: Senath, Thevin, et al.
Published: (2024)

Survey on Publicly Available Sinhala Natural Language Processing Tools and Research
by: de Silva, Nisansa
Published: (2019)

Selecting Seed Words for Wordle using Character Statistics
by: de Silva, Nisansa
Published: (2022)

Extracting Disaster Impacts and Impact Related Locations in Social Media Posts Using Large Language Models
by: Hameed, Sameeah Noreen, et al.
Published: (2025)

A Multi-way Parallel Named Entity Annotated Corpus for English, Tamil and Sinhala
by: Ranathunga, Surangika, et al.
Published: (2024)

SiTSE: Sinhala Text Simplification Dataset and Evaluation
by: Ranathunga, Surangika, et al.
Published: (2024)

Error-Robust Retrieval for Chinese Spelling Check
by: Yin, Xunjian, et al.
Published: (2022)

SiDiaC: Sinhala Diachronic Corpus
by: Jayatilleke, Nevidu, et al.
Published: (2025)

How Good is BLI as an Alignment Measure: A Study in Word Embedding Paradigm
by: Wickramasinghe, Kasun, et al.
Published: (2025)

Fine Tuning Named Entity Extraction Models for the Fantasy Domain
by: Sivaganeshan, Aravinth, et al.
Published: (2024)

Mixture of Small and Large Models for Chinese Spelling Check
by: Qiao, Ziheng, et al.
Published: (2025)

SHADE: Semantic Hypernym Annotator for Domain-specific Entities -- DnD Domain Use Case
by: Peiris, Akila, et al.
Published: (2024)

GeeSanBhava: Sentiment Tagged Sinhala Music Video Comment Data Set
by: De Mel, Yomal, et al.
Published: (2025)

DESS: DeBERTa Enhanced Syntactic-Semantic Aspect Sentiment Triplet Extraction
by: Thenuwara, Vishal, et al.
Published: (2025)

A Framework to Assess Multilingual Vulnerabilities of LLMs
by: Tang, Likai, et al.
Published: (2025)

Automating Research Synthesis with Domain-Specific Large Language Model Fine-Tuning
by: Susnjak, Teo, et al.
Published: (2024)

Rich Semantic Knowledge Enhanced Large Language Models for Few-shot Chinese Spell Checking
by: Dong, Ming, et al.
Published: (2024)

C-LLM: Learn to Check Chinese Spelling Errors Character by Character
by: Li, Kunting, et al.
Published: (2024)

Contextual Spelling Correction with Language Model for Low-resource Setting
by: Luitel, Nishant, et al.
Published: (2024)

CSCD-NS: a Chinese Spelling Check Dataset for Native Speakers
by: Hu, Yong, et al.
Published: (2022)

An Empirical Investigation of Domain Adaptation Ability for Chinese Spelling Check Models
by: Wang, Xi, et al.
Published: (2024)

Linguistic Analysis of Sinhala YouTube Comments on Sinhala Music Videos: A Dataset Study
by: De Mel, W. M. Yomal, et al.
Published: (2025)

M2DS: Multilingual Dataset for Multi-document Summarisation
by: Hewapathirana, Kushan, et al.
Published: (2024)

Elementary Math Word Problem Generation using Large Language Models
by: Ariyarathne, Nimesh, et al.
Published: (2025)

DISC: Plug-and-Play Decoding Intervention with Similarity of Characters for Chinese Spelling Check
by: Qiao, Ziheng, et al.
Published: (2024)

Bi-DCSpell: A Bi-directional Detector-Corrector Interactive Framework for Chinese Spelling Check
by: Wu, Haiming, et al.
Published: (2024)

FastSpell: the LangId Magic Spell
by: Bañón, Marta, et al.
Published: (2024)