| Main Authors: | Aquino, Angelina A., Miranda, Lester James V., Or, Elsie Marie T. |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2505.20428 |
Similar Items
A UD Treebank for Bohairic Coptic
by: Zeldes, Amir, et al.
Published: (2025)
Towards the first UD Treebank of Spoken Italian: the KIParla forest
by: Pannitto, Ludovica
Published: (2024)
Aligning the Norwegian UD Treebank with Entity and Coreference Information
by: Jørgensen, Tollef Emil, et al.
Published: (2023)
Syntactic Transfer to Kyrgyz Using the Treebank Translation Method
by: Alekseev, Anton, et al.
Published: (2024)
Urdu Dependency Parsing and Treebank Development: A Syntactic and Morphological Perspective
by: Habib, Nudrat
Published: (2024)
UD-KSL Treebank v1.3: A semi-automated framework for aligning XPOS-extracted units with UPOS tags
by: Sung, Hakyung, et al.
Published: (2025)
Annotating Constructions with UD: the experience of the Italian Constructicon
by: Pannitto, Ludovica, et al.
Published: (2024)
Building Tamil Treebanks
by: Sarveswaran, Kengatharaiyer
Published: (2024)
LuxBank: The First Universal Dependency Treebank for Luxembourgish
by: Plum, Alistair, et al.
Published: (2024)
CC-GPX: Extracting High-Quality Annotated Geospatial Data from Common Crawl
by: Ilyankou, Ilya, et al.
Published: (2024)
Thai Universal Dependency Treebank
by: Sriwirote, Panyut, et al.
Published: (2024)
Constituency Structure over Eojeol in Korean Treebanks
by: Park, Jungyeul, et al.
Published: (2025)
Bridging Discourse Treebanks with a Unified Rhetorical Structure Parser
by: Chistova, Elena
Published: (2025)
Quantifying Geospatial in the Common Crawl Corpus
by: Ilyankou, Ilya, et al.
Published: (2024)
Polyglot Teachers: Evaluating Language Models for Multilingual Synthetic Data Generation
by: Miranda, Lester James V., et al.
Published: (2026)
Cross-linguistically Consistent Semantic and Syntactic Annotation of Child-directed Speech
by: Szubert, Ida, et al.
Published: (2021)
Parsing the Switch: LLM-Based UD Annotation for Complex Code-Switched and Low-Resource Languages
by: Kellert, Olga, et al.
Published: (2025)
Latin Treebanks in Review: An Evaluation of Morphological Tagging Across Time
by: Hudspeth, Marisa, et al.
Published: (2024)
Building UD Cairo for Old English in the Classroom
by: Levine, Lauren, et al.
Published: (2025)
MaiBaam: A Multi-Dialectal Bavarian Universal Dependency Treebank
by: Blaschke, Verena, et al.
Published: (2024)
Multilinguality at the Edge: Developing Language Models for the Global South
by: Miranda, Lester James V., et al.
Published: (2026)
Infini-News: Efficiently Queryable Access to 1.3 Billion Processed Common Crawl News Articles
by: Lazzaroni, Ruggero Marino, et al.
Published: (2026)
K-UD: Revising Korean Universal Dependencies Guidelines
by: Kim, Kyuwon, et al.
Published: (2024)
Emergent Convergence in Multi-Agent LLM Annotation
by: Parfenova, Angelina, et al.
Published: (2025)
Contrastive Learning on LLM Back Generation Treebank for Cross-domain Constituency Parsing
by: Guo, Peiming, et al.
Published: (2025)
Enriching the NArabizi Treebank: A Multifaceted Approach to Supporting an Under-Resourced Language
by: Riabi, Arij, et al.
Published: (2023)
Coconstructions in spoken data: UD annotation guidelines and first results
by: Pannitto, Ludovica, et al.
Published: (2026)
Smart Bilingual Focused Crawling of Parallel Documents
by: García-Romero, Cristian, et al.
Published: (2024)
Sparse Logistic Regression with High-order Features for Automatic Grammar Rule Extraction from Treebanks
by: Herrera, Santiago, et al.
Published: (2024)
Multilingual Nonce Dependency Treebanks: Understanding how Language Models represent and process syntactic structure
by: Arps, David, et al.
Published: (2023)
UnifiedCrawl: Aggregated Common Crawl for Affordable Adaptation of LLMs on Low-Resource Languages
by: Tessema, Bethel Melesse, et al.
Published: (2024)
Craw4LLM: Efficient Web Crawling for LLM Pretraining
by: Yu, Shi, et al.
Published: (2025)
Leveraging Web-Crawled Data for High-Quality Fine-Tuning
by: Zhou, Jing, et al.
Published: (2024)
External Sandhi and its Relevance to Syntactic Treebanking
by: Sudheer Kolachina
Published: (2011)
Reading Right: Tagalog Translation Manual. English for Special Purposes Series: Nursing Aide.
by: Berzabal, Ofelia G.
Published: (1977)
Web Page Classification using LLMs for Crawling Support
by: Sasazawa, Yuichi, et al.
Published: (2025)
Reflections and New Directions for Human-Centered Large Language Models
by: Ziems, Caleb, et al.
Published: (2026)
Nemotron-CC: Transforming Common Crawl into a Refined Long-Horizon Pretraining Dataset
by: Su, Dan, et al.
Published: (2024)
Exploring Multiple Strategies to Improve Multilingual Coreference Resolution in CorefUD
by: Pražák, Ondřej, et al.
Published: (2024)
Parser agreement and disagreement in L2 Korean UD: Implications for human-in-the-loop annotation
by: Sung, Hakyung, et al.
Published: (2026)