| Main Authors: | Situngkir, H., Lumbantobing, A. B., Surya, Y. |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2601.11643 |
Similar Items
Tokenizations for Austronesian Language Models: study on languages in Indonesia Archipelago
by: Lumbantobing, Andhika Bernard, et al.
Published: (2026)
Adaptive Engram Memory System for Indonesian Language Model: Generative AI Based on TOBA LM for Batak and Minang Language
by: Situngkir, Hokky, et al.
Published: (2026)
LLM-Based Multi-Task Bangla Hate Speech Detection: Type, Severity, and Target
by: Hasan, Md Arid, et al.
Published: (2025)
Separate Before You Compress: The WWHO Tokenization Architecture
by: Darshana, Kusal
Published: (2026)
Multilingual and Multimodal LLMs in the Wild: Building for Low-Resource Languages
by: Alam, Firoj, et al.
Published: (2026)
PropXplain: Can LLMs Enable Explainable Propaganda Detection?
by: Hasanain, Maram, et al.
Published: (2025)
Sinhala Transliteration: A Comparative Analysis Between Rule-based and Seq2Seq Approaches
by: De Mel, Yomal, et al.
Published: (2024)
Large Language Models for Propaganda Span Annotation
by: Hasanain, Maram, et al.
Published: (2023)
LlamaLens: Specialized Multilingual LLM for Analyzing News and Social Media Content
by: Kmainasi, Mohamed Bayan, et al.
Published: (2024)
AVEC: Bootstrapping Privacy for Local LLMs
by: Gaikwad, Madhava
Published: (2025)
TituLLMs: A Family of Bangla LLMs with Comprehensive Benchmarking
by: Nahin, Shahriar Kabir, et al.
Published: (2025)
LLMeBench: A Flexible Framework for Accelerating LLMs Benchmarking
by: Dalvi, Fahim, et al.
Published: (2023)
AraDiCE: Benchmarks for Dialectal and Cultural Capabilities in LLMs
by: Mousi, Basel, et al.
Published: (2024)
NativQA Framework: Enabling LLMs and VLMs with Native, Local, and Everyday Knowledge
by: Alam, Firoj, et al.
Published: (2025)
A Multiple-Fill-in-the-Blank Exam Approach for Enhancing Zero-Resource Hallucination Detection in Large Language Models
by: Munakata, Satoshi, et al.
Published: (2024)
OASIS: A Multilingual and Multimodal Dataset for Culturally Grounded Spoken Visual QA
by: Alam, Firoj, et al.
Published: (2025)
DEM: Distribution Edited Model for Training with Mixed Data Distributions
by: Ram, Dhananjay, et al.
Published: (2024)
Propaganda to Hate: A Multimodal Analysis of Arabic Memes with Multi-Agent LLMs
by: Alam, Firoj, et al.
Published: (2024)
CultranAI at PalmX 2025: Data Augmentation for Cultural Knowledge Representation
by: Bhatti, Hunzalah Hassan, et al.
Published: (2025)
LAraBench: Benchmarking Arabic AI with Large Language Models
by: Abdelali, Ahmed, et al.
Published: (2023)
NativQA: Multilingual Culturally-Aligned Natural Query for LLMs
by: Hasan, Md. Arid, et al.
Published: (2024)
Beyond MCQ: An Open-Ended Arabic Cultural QA Benchmark with Dialect Variants
by: Bhatti, Hunzalah Hassan, et al.
Published: (2025)
Native vs Non-Native Language Prompting: A Comparative Analysis
by: Kmainasi, Mohamed Bayan, et al.
Published: (2024)
GenAI Content Detection Task 2: AI vs. Human -- Academic Essay Authenticity Challenge
by: Chowdhury, Shammur Absar, et al.
Published: (2024)
ThatiAR: Subjectivity Detection in Arabic News Sentences
by: Suwaileh, Reem, et al.
Published: (2024)
Approaching I/O-optimality for Approximate Attention
by: Papp, Pál András, et al.
Published: (2026)
SAGED: A Holistic Bias-Benchmarking Pipeline for Language Models with Customisable Fairness Calibration
by: Guan, Xin, et al.
Published: (2024)
Mobile Phone Sensor-based Nigerian Driving Dataset to Detect Alcohol-influenced Behaviours
by: Thompson, Iniakpokeikiye Peter, et al.
Published: (2025)
The Ethics Engine: A Modular Pipeline for Accessible Psychometric Assessment of Large Language Models
by: Van Clief, Jake, et al.
Published: (2025)
Drift and selection in LLM text ecosystems
by: Riis, Søren
Published: (2026)
TableMoE: Neuro-Symbolic Routing for Structured Expert Reasoning in Multimodal Table Understanding
by: Zhang, Junwen, et al.
Published: (2025)
Enhancing Trust in LLMs: Algorithms for Comparing and Interpreting LLMs
by: Brown, Nik Bear
Published: (2024)
Uncovering Uncertainty in Transformer Inference
by: Brothers, Greyson, et al.
Published: (2024)
Understanding and Improving Information Preservation in Prompt Compression for LLMs
by: Łajewska, Weronika, et al.
Published: (2025)
Beyond Accuracy: Decomposing the Reasoning Efficiency of LLMs
by: Kaiser, Daniel, et al.
Published: (2026)
DRO-InstructZero: Distributionally Robust Prompt Optimization for Large Language Models
by: Li, Yangyang
Published: (2025)
Thinking Longer, Not Always Smarter: Evaluating LLM Capabilities in Hierarchical Legal Reasoning
by: Zhang, Li, et al.
Published: (2025)
D-SMART: Enhancing LLM Dialogue Consistency via Dynamic Structured Memory And Reasoning Tree
by: Lei, Xiang, et al.
Published: (2025)
LLM-Viterbi: Semantic-Aware Decoding for Convolutional Codes
by: Li, Zhengtong, et al.
Published: (2026)
Mitigating LLM Hallucinations through Domain-Grounded Tiered Retrieval
by: Haque, Md. Asraful, et al.
Published: (2026)