| Main Authors: | Gunathilake, Akesh; Karunarathna, Nadil; Bandaranayake, Tharusha; de Silva, Nisansa; Ranathunga, Surangika |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2512.05414 |
Similar Items
Sinhala Physical Common Sense Reasoning Dataset for Global PIQA
by: de Silva, Nisansa, et al.
Published: (2026)
Utilizing Multilingual Encoders to Improve Large Language Models for Low-Resource Languages
by: Puranegedara, Imalsha, et al.
Published: (2025)
Improving the quality of Web-mined Parallel Corpora of Low-Resource Languages using Debiasing Heuristics
by: Fernando, Aloka, et al.
Published: (2025)
Linguistic Entity Masking to Improve Cross-Lingual Representation of Multilingual Language Models for Low-Resource Languages
by: Fernando, Aloka, et al.
Published: (2025)
Shoulders of Giants: A Look at the Degree and Utility of Openness in NLP Research
by: Ranathunga, Surangika, et al.
Published: (2024)
Quality Does Matter: A Detailed Look at the Quality and Utility of Web-Mined Parallel Corpora
by: Ranathunga, Surangika, et al.
Published: (2024)
Sinhala Transliteration: A Comparative Analysis Between Rule-based and Seq2Seq Approaches
by: De Mel, Yomal, et al.
Published: (2024)
SinLlama -- A Large Language Model for Sinhala
by: Aravinda, H. W. K., et al.
Published: (2025)
Unsupervised Bilingual Lexicon Induction for Low Resource Languages
by: Rathnayake, Charitha, et al.
Published: (2024)
Unlocking Parameter-Efficient Fine-Tuning for Low-Resource Language Translation
by: Su, Tong, et al.
Published: (2024)
Zero-shot OCR Accuracy of Low-Resourced Languages: A Comparative Analysis on Sinhala and Tamil
by: Jayatilleke, Nevidu, et al.
Published: (2025)
Sinhala-English Word Embedding Alignment: Introducing Datasets and Benchmark for a Low Resource Language
by: Wickramasinghe, Kasun, et al.
Published: (2023)
Beyond Vanilla Fine-Tuning: Leveraging Multistage, Multilingual, and Domain-Specific Methods for Low-Resource Machine Translation
by: Thillainathan, Sarubi, et al.
Published: (2025)
Large Language Models for Ingredient Substitution in Food Recipes using Supervised Fine-tuning and Direct Preference Optimization
by: Senath, Thevin, et al.
Published: (2024)
Survey on Publicly Available Sinhala Natural Language Processing Tools and Research
by: de Silva, Nisansa
Published: (2019)
Selecting Seed Words for Wordle using Character Statistics
by: de Silva, Nisansa
Published: (2022)
Extracting Disaster Impacts and Impact Related Locations in Social Media Posts Using Large Language Models
by: Hameed, Sameeah Noreen, et al.
Published: (2025)
A Multi-way Parallel Named Entity Annotated Corpus for English, Tamil and Sinhala
by: Ranathunga, Surangika, et al.
Published: (2024)
SiTSE: Sinhala Text Simplification Dataset and Evaluation
by: Ranathunga, Surangika, et al.
Published: (2024)
Error-Robust Retrieval for Chinese Spelling Check
by: Yin, Xunjian, et al.
Published: (2022)
SiDiaC: Sinhala Diachronic Corpus
by: Jayatilleke, Nevidu, et al.
Published: (2025)
How Good is BLI as an Alignment Measure: A Study in Word Embedding Paradigm
by: Wickramasinghe, Kasun, et al.
Published: (2025)
Fine Tuning Named Entity Extraction Models for the Fantasy Domain
by: Sivaganeshan, Aravinth, et al.
Published: (2024)
Mixture of Small and Large Models for Chinese Spelling Check
by: Qiao, Ziheng, et al.
Published: (2025)
SHADE: Semantic Hypernym Annotator for Domain-specific Entities -- DnD Domain Use Case
by: Peiris, Akila, et al.
Published: (2024)
GeeSanBhava: Sentiment Tagged Sinhala Music Video Comment Data Set
by: De Mel, Yomal, et al.
Published: (2025)
DESS: DeBERTa Enhanced Syntactic-Semantic Aspect Sentiment Triplet Extraction
by: Thenuwara, Vishal, et al.
Published: (2025)
A Framework to Assess Multilingual Vulnerabilities of LLMs
by: Tang, Likai, et al.
Published: (2025)
Automating Research Synthesis with Domain-Specific Large Language Model Fine-Tuning
by: Susnjak, Teo, et al.
Published: (2024)
Rich Semantic Knowledge Enhanced Large Language Models for Few-shot Chinese Spell Checking
by: Dong, Ming, et al.
Published: (2024)
C-LLM: Learn to Check Chinese Spelling Errors Character by Character
by: Li, Kunting, et al.
Published: (2024)
Contextual Spelling Correction with Language Model for Low-resource Setting
by: Luitel, Nishant, et al.
Published: (2024)
CSCD-NS: a Chinese Spelling Check Dataset for Native Speakers
by: Hu, Yong, et al.
Published: (2022)
An Empirical Investigation of Domain Adaptation Ability for Chinese Spelling Check Models
by: Wang, Xi, et al.
Published: (2024)
Linguistic Analysis of Sinhala YouTube Comments on Sinhala Music Videos: A Dataset Study
by: De Mel, W. M. Yomal, et al.
Published: (2025)
M2DS: Multilingual Dataset for Multi-document Summarisation
by: Hewapathirana, Kushan, et al.
Published: (2024)
Elementary Math Word Problem Generation using Large Language Models
by: Ariyarathne, Nimesh, et al.
Published: (2025)
DISC: Plug-and-Play Decoding Intervention with Similarity of Characters for Chinese Spelling Check
by: Qiao, Ziheng, et al.
Published: (2024)
Bi-DCSpell: A Bi-directional Detector-Corrector Interactive Framework for Chinese Spelling Check
by: Wu, Haiming, et al.
Published: (2024)
FastSpell: the LangId Magic Spell
by: Bañón, Marta, et al.
Published: (2024)