Library Catalog

Bibliographic Details
Main Authors:	Fedchin, Aleksandr; Cooperman, Isabel; Chaudhuri, Pramit; Dexter, Joseph P.
Format:	Preprint
Published:	2024
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2408.04427

Similar Items

KRISTEVA: Close Reading as a Novel Task for Benchmarking Interpretive Reasoning
by: Sui, Peiqi, et al.
Published: (2025)

DafnyMPI: A Dafny Library for Verifying Message-Passing Concurrent Programs
by: Fedchin, Aleksandr, et al.
Published: (2025)

Active Learning for Multilingual Fingerspelling Corpora
by: Wang, Shuai, et al.
Published: (2023)

NLIP_Lab-IITH Multilingual MT System for WAT24 MT Shared Task
by: Brahma, Maharaj, et al.
Published: (2024)

Multilingual and Explainable Text Detoxification with Parallel Corpora
by: Dementieva, Daryna, et al.
Published: (2024)

Multilingual Embedding Probes Fail to Generalize Across Learner Corpora
by: Lyngbaek, Laurits, et al.
Published: (2026)

CAMEO: Collection of Multilingual Emotional Speech Corpora
by: Christop, Iwona, et al.
Published: (2025)

A Recipe of Parallel Corpora Exploitation for Multilingual Large Language Models
by: Lin, Peiqin, et al.
Published: (2024)

Applying NLP to iMessages: Understanding Topic Avoidance, Responsiveness, and Sentiment
by: Gerber, Alan, et al.
Published: (2025)

A Survey on Multilingual Large Language Models: Corpora, Alignment, and Bias
by: Xu, Yuemei, et al.
Published: (2024)

Lexical and Statistical Analysis of Bangla Newspaper and Literature: A Corpus-Driven Study on Diversity, Readability, and NLP Adaptation
by: Bhattacharyya, Pramit, et al.
Published: (2025)

BanglaByT5: Byte-Level Modelling for Bangla
by: Bhattacharyya, Pramit, et al.
Published: (2025)

From Unaligned to Aligned: Scaling Multilingual LLMs with Multi-Way Parallel Corpora
by: Shen, Yingli, et al.
Published: (2025)

Entity Insertion in Multilingual Linked Corpora: The Case of Wikipedia
by: Feith, Tomás, et al.
Published: (2024)

Model Internal Sleuthing: Finding Lexical Identity and Inflectional Features in Modern Language Models
by: Li, Michael, et al.
Published: (2025)

Leveraging LLMs for Bangla Grammar Error Correction: Error Categorization, Synthetic Data, and Model Evaluation
by: Bhattacharyya, Pramit, et al.
Published: (2024)

A Decision Procedure for Probabilistic Kleene Algebra with Angelic Nondeterminism
by: Ong, Shawn, et al.
Published: (2025)

GhanaNLP Parallel Corpora: Comprehensive Multilingual Resources for Low-Resource Ghanaian Languages
by: Gyamfi, Lawrence Adu, et al.
Published: (2026)

Batched Low-Rank Adaptation of Foundation Models
by: Wen, Yeming, et al.
Published: (2023)

CodePivot: Bootstrapping Multilingual Transpilation in LLMs via Reinforcement Learning without Parallel Corpora
by: Li, Shangyu, et al.
Published: (2026)

SpeakerSleuth: Can Large Audio-Language Models Judge Speaker Consistency across Multi-turn Dialogues?
by: Lee, Jonggeun, et al.
Published: (2026)

Truth Sleuth and Trend Bender: AI Agents to fact-check YouTube videos and influence opinions
by: Logé, Cécile, et al.
Published: (2025)

LGR2: Language Guided Reward Relabeling for Accelerating Hierarchical Reinforcement Learning
by: Singh, Utsav, et al.
Published: (2024)

On the Limits of Model Merging for Multilinguality in Pre-Training
by: Aycock, Seth, et al.
Published: (2026)

DIWALI: Diversity and Inclusivity aWare cuLture specific Items for India: Dataset and Assessment of LLMs for Cultural Text Adaptation in Indian Context
by: Sahoo, Pramit, et al.
Published: (2025)

Joint Distributions in Probabilistic Semantics
by: Kozen, Dexter, et al.
Published: (2023)

Identifying Emerging Concepts in Large Corpora
by: Ma, Sibo, et al.
Published: (2025)

Validating and Exploring Large Geographic Corpora
by: Dunn, Jonathan
Published: (2024)

Discovering Multi-Scale Semantic Structure in Text Corpora Using Density-Based Trees and LLM Embeddings
by: Haschka, Thomas, et al.
Published: (2025)

Vacaspati: A Diverse Corpus of Bangla Literature
by: Bhattacharyya, Pramit, et al.
Published: (2023)

Align and Shine: Building High-Quality Sentence-Aligned Corpora for Multilingual Text Simplification
by: Hilasaca, Kenji, et al.
Published: (2026)

Mathematical Entities: Corpora and Benchmarks
by: Collard, Jacob, et al.
Published: (2024)

NLIP_Lab-IITH Low-Resource MT System for WMT24 Indic MT Shared Task
by: Sahoo, Pramit, et al.
Published: (2024)

A Demonic Outcome Logic for Randomized Nondeterminism
by: Zilberstein, Noam, et al.
Published: (2024)

Text-to-SPARQL Goes Beyond English: Multilingual Question Answering Over Knowledge Graphs through Human-Inspired Reasoning
by: Perevalov, Aleksandr, et al.
Published: (2025)

New Textual Corpora for Serbian Language Modeling
by: Škorić, Mihailo, et al.
Published: (2024)

Bias in News Summarization: Measures, Pitfalls and Corpora
by: Steen, Julius, et al.
Published: (2023)

Comparable Corpora: Opportunities for New Research Directions
by: Church, Kenneth
Published: (2025)

The Growing Gains and Pains of Iterative Web Corpora Crawling: Insights from South Slavic CLASSLA-web 2.0 Corpora
by: Pungeršek, Taja Kuzman, et al.
Published: (2026)

AI Brown and AI Koditex: LLM-Generated Corpora Comparable to Traditional Corpora of English and Czech Texts
by: Milička, Jiří, et al.
Published: (2025)