:: Library Catalog

Bibliographic Details
Main Authors:	Abbas, Alexandra; Waggoner, Celia; Olive, Justin
Format:	Preprint
Published:	2025
Subjects:	Computation and Language; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2507.06893

Similar Items

ViSoLex: An Open-Source Repository for Vietnamese Social Media Lexical Normalization
by: Nguyen, Anh Thi-Hoang, et al.
Published: (2025)

Lynx: An Open Source Hallucination Evaluation Model
by: Ravi, Selvan Sunitha, et al.
Published: (2024)

OpenFActScore: Open-Source Atomic Evaluation of Factuality in Text Generation
by: Lage, Lucas Fonseca, et al.
Published: (2025)

Narrative Context Protocol: An Open-Source Storytelling Framework for Generative AI
by: Gerba, Hank
Published: (2025)

SWE-Bench++: A Framework for the Scalable Generation of Software Engineering Benchmarks from Open-Source Repositories
by: Wang, Lilin, et al.
Published: (2025)

Veracity: An Open-Source AI Fact-Checking System
by: Curtis, Taylor Lynn, et al.
Published: (2025)

A Survey on Uncertainty Quantification of Large Language Models: Taxonomy, Open Research Challenges, and Future Directions
by: Shorinwa, Ola, et al.
Published: (2024)

Detecting AI-Generated Sentences in Human-AI Collaborative Hybrid Texts: Challenges, Strategies, and Insights
by: Zeng, Zijie, et al.
Published: (2024)

MedAlpaca -- An Open-Source Collection of Medical Conversational AI Models and Training Data
by: Han, Tianyu, et al.
Published: (2023)

RepoAgent: An LLM-Powered Open-Source Framework for Repository-level Code Documentation Generation
by: Luo, Qinyu, et al.
Published: (2024)

Thematic Analysis with Open-Source Generative AI and Machine Learning: A New Method for Inductive Qualitative Codebook Development
by: Katz, Andrew, et al.
Published: (2024)

Promises, Outlooks and Challenges of Diffusion Language Modeling
by: Deschenaux, Justin, et al.
Published: (2024)

Mathematical Reasoning in Large Language Models: Benchmarks, Architectures, Evaluation, and Open Challenges
by: Amjad, Husnain, et al.
Published: (2026)

Is Open Source the Future of AI? A Data-Driven Approach
by: Vake, Domen, et al.
Published: (2025)

Performance Evaluation of Open-Source Large Language Models for Assisting Pathology Report Writing in Japanese
by: Kawai, Masataka, et al.
Published: (2026)

OpenMathInstruct-2: Accelerating AI for Math with Massive Open-Source Instruction Data
by: Toshniwal, Shubham, et al.
Published: (2024)

OpenHands: An Open Platform for AI Software Developers as Generalist Agents
by: Wang, Xingyao, et al.
Published: (2024)

Understanding LLM Development Through Longitudinal Study: Insights from the Open Ko-LLM Leaderboard
by: Park, Chanjun, et al.
Published: (2024)

Maintaining Journalistic Integrity in the Digital Age: A Comprehensive NLP Framework for Evaluating Online News Content
by: Bojic, Ljubisa, et al.
Published: (2024)

Analyzing Feedback Mechanisms in AI-Generated MCQs: Insights into Readability, Lexical Properties, and Levels of Challenge
by: Yaacoub, Antoun, et al.
Published: (2025)

Use of AI Tools: Guidelines to Maintain Academic Integrity in Computing Colleges
by: El-boghdadi, Hatem M., et al.
Published: (2026)

Benchmarking Open-Source Safety Guard Models: A Comprehensive Evaluation
by: Harsh, Reetu Raj, et al.
Published: (2026)

Orchard: An Open-Source Agentic Modeling Framework
by: Peng, Baolin, et al.
Published: (2026)

Qabas: An Open-Source Arabic Lexicographic Database
by: Jarrar, Mustafa, et al.
Published: (2024)

ML-Bench: Evaluating Large Language Models and Agents for Machine Learning Tasks on Repository-Level Code
by: Tang, Xiangru, et al.
Published: (2023)

Is Open-Source There Yet? A Comparative Study on Commercial and Open-Source LLMs in Their Ability to Label Chest X-Ray Reports
by: Dorfner, Felix J., et al.
Published: (2024)

OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models
by: Wang, Jun, et al.
Published: (2024)

Localizing AI: Evaluating Open-Weight Language Models for Languages of Baltic States
by: Kapočiūtė-Dzikienė, Jurgita, et al.
Published: (2025)

SWE-CI: Evaluating Agent Capabilities in Maintaining Codebases via Continuous Integration
by: Chen, Jialong, et al.
Published: (2026)

RepoGraph: Enhancing AI Software Engineering with Repository-level Code Graph
by: Ouyang, Siru, et al.
Published: (2024)

AI-Assisted Systematization for Evaluating GenAI Systems
by: Agarwal, Dhruv, et al.
Published: (2026)

The Invisible Hand of AI Libraries Shaping Open Source Projects and Communities
by: Esposito, Matteo, et al.
Published: (2026)

OpenSeeker: Democratizing Frontier Search Agents by Fully Open-Sourcing Training Data
by: Du, Yuwen, et al.
Published: (2026)

TinyLlama: An Open-Source Small Language Model
by: Zhang, Peiyuan, et al.
Published: (2024)

Design of an Open-Source Architecture for Neural Machine Translation
by: Lankford, Séamus, et al.
Published: (2024)

Security Challenges in AI Agent Deployment: Insights from a Large Scale Public Competition
by: Zou, Andy, et al.
Published: (2025)

SUPER: Evaluating Agents on Setting Up and Executing Tasks from Research Repositories
by: Bogin, Ben, et al.
Published: (2024)

CT Open: An Open-Access, Uncontaminated, Live Platform for the Open Challenge of Clinical Trial Outcome Prediction
by: Wang, Jianyou, et al.
Published: (2026)

Graphically Speaking: Unmasking Abuse in Social Media with Conversation Insights
by: Nouri, Célia, et al.
Published: (2025)

VerifAI: A Verifiable Open-Source Search Engine for Biomedical Question Answering
by: Košprdić, Miloš, et al.
Published: (2026)