Bibliographic Details
Main Authors: Baiju, Bajiyo; Manohar, Kavya; Pillai, Leena G; Sherly, Elizabeth
Format: Preprint
Published: 2024
Online Access: https://arxiv.org/abs/2412.09957
Table of Contents:
  • In this work, we present the development of a reverse transliteration model that converts romanized Malayalam to native script, using an encoder-decoder framework built on an attention-based bidirectional Long Short-Term Memory (Bi-LSTM) architecture. To train the model, we used a curated, combined collection of 4.3 million transliteration pairs derived from two publicly available Indic language transliteration datasets, Dakshina and Aksharantar. We evaluated the model on two test datasets provided by the IndoNLP-2025-Shared-Task, containing (1) general typing patterns and (2) ad hoc typing patterns, respectively. On Test Set-1, we obtained a character error rate (CER) of 7.4%. On Test Set-2, however, which contains ad hoc typing patterns where most vowel indicators are missing, our model gave a CER of 22.7%.
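
The character error rate reported above is commonly defined as the total character-level edit distance between the model output and the reference, divided by the total number of reference characters. The following is a minimal sketch of that common definition, not the authors' evaluation script. The function names `edit_distance` and `cer` are hypothetical, and the sketch assumes characters are counted as Unicode code points; the record does not say whether the shared task counts code points or grapheme clusters, which matters for Malayalam.

```python
from typing import Sequence


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted over code points."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty string
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete r from the reference
                curr[j - 1] + 1,     # insert h into the reference
                prev[j - 1] + cost,  # substitute r with h, or match
            ))
        prev = curr
    return prev[-1]


def cer(references: Sequence[str], hypotheses: Sequence[str]) -> float:
    """Corpus-level CER: total edits divided by total reference characters.

    Assumption: scores are pooled over the whole test set rather than
    averaged per word; the source does not state which was used.
    """
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    return total_edits / total_chars


if __name__ == "__main__":
    # Illustrative pair only, not taken from the shared-task data:
    # the hypothesis is missing one vowel sign.
    print(cer(["മലയാളം"], ["മലയളം"]))
```

Under this definition, a CER of 7.4% corresponds to roughly 7 character edits per 100 reference characters.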