| Main Authors: | Kabongo, Salomon, D'Souza, Jennifer, Auer, Sören |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2407.02409 |
Similar Items
Exploring the Latest LLMs for Leaderboard Extraction
by: Kabongo, Salomon, et al.
Published: (2024)
Instruction Finetuning for Leaderboard Generation from Empirical AI Research
by: Kabongo, Salomon, et al.
Published: (2024)
Large Language Models for Scientific Information Extraction: An Empirical Study for Virology
by: Shamsabadi, Mahsa, et al.
Published: (2024)
Large Language Models as Evaluators for Scientific Synthesis
by: Evans, Julia, et al.
Published: (2024)
LLMs4OL 2024 Overview: The 1st Large Language Models for Ontology Learning Challenge
by: Giglou, Hamed Babaei, et al.
Published: (2024)
LLMs4Synthesis: Leveraging Large Language Models for Scientific Synthesis
by: Giglou, Hamed Babaei, et al.
Published: (2024)
Fine-tuning and Prompt Engineering with Cognitive Knowledge Graphs for Scholarly Knowledge Organization
by: Rabby, Gollam, et al.
Published: (2024)
OntoAligner: A Comprehensive Modular and Robust Python Toolkit for Ontology Alignment
by: Giglou, Hamed Babaei, et al.
Published: (2025)
Diagnosing Structural Failures in LLM-Based Evidence Extraction for Meta-Analysis
by: Tan, Zhiyin, et al.
Published: (2026)
A FAIR and Free Prompt-based Research Assistant
by: Shamsabadi, Mahsa, et al.
Published: (2024)
Scholarly Question Answering using Large Language Models in the NFDI4DataScience Gateway
by: Giglou, Hamed Babaei, et al.
Published: (2024)
The Leaderboard Illusion
by: Singh, Shivalika, et al.
Published: (2025)
YESciEval: Robust LLM-as-a-Judge for Scientific Question Answering
by: D'Souza, Jennifer, et al.
Published: (2025)
Toward Purpose-oriented Topic Model Evaluation enabled by Large Language Models
by: Tan, Zhiyin, et al.
Published: (2025)
Bridging the Evaluation Gap: Leveraging Large Language Models for Topic Model Evaluation
by: Tan, Zhiyin, et al.
Published: (2025)
From Keywords to Structured Summaries: Streamlining Scholarly Information Access
by: Shamsabadi, Mahsa, et al.
Published: (2024)
Publishing FAIR and Machine-actionable Reviews in Materials Science: The Case for Symbolic Knowledge in Neuro-symbolic Artificial Intelligence
by: D'Souza, Jennifer, et al.
Published: (2026)
LLMs4SchemaDiscovery: A Human-in-the-Loop Workflow for Scientific Schema Mining with Large Language Models
by: Sadruddin, Sameer, et al.
Published: (2025)
Can We Locate and Prevent Stereotypes in LLMs?
by: D'Souza, Alex
Published: (2026)
Improving LLM Leaderboards with Psychometrical Methodology
by: Federiakin, Denis
Published: (2025)
Evaluating Large Language Models for Structured Science Summarization in the Open Research Knowledge Graph
by: Nechakhin, Vladyslav, et al.
Published: (2024)
Astro-NER -- Astronomy Named Entity Recognition: Is GPT a Good Domain Expert Annotator?
by: Evans, Julia, et al.
Published: (2024)
Iterative Hypothesis Generation for Scientific Discovery with Monte Carlo Nash Equilibrium Self-Refining Trees
by: Rabby, Gollam, et al.
Published: (2025)
LEGOBench: Scientific Leaderboard Generation Benchmark
by: Singh, Shruti, et al.
Published: (2024)
League: Leaderboard Generation on Demand
by: Wu, Jian, et al.
Published: (2025)
SemEval-2025 Task 5: LLMs4Subjects -- LLM-based Automated Subject Tagging for a National Technical Library's Open-Access Catalog
by: D'Souza, Jennifer, et al.
Published: (2025)
Benchmarking LLM Faithfulness in RAG with Evolving Leaderboards
by: Tamber, Manveer Singh, et al.
Published: (2025)
DeepResearch$^{\text{Eco}}$: A Recursive Agentic Workflow for Complex Scientific Question Answering in Ecology
by: D'Souza, Jennifer, et al.
Published: (2025)
Understanding LLM Development Through Longitudinal Study: Insights from the Open Ko-LLM Leaderboard
by: Park, Chanjun, et al.
Published: (2024)
Prompt-to-Leaderboard
by: Frick, Evan, et al.
Published: (2025)
SCI-IDEA: Context-Aware Scientific Ideation Using Token and Sentence Embeddings
by: Keya, Farhana, et al.
Published: (2025)
The Trust Paradox: How CS Researchers Engage LLM Leaderboards
by: Sadeghi, Pouya, et al.
Published: (2026)
Libra-Leaderboard: Towards Responsible AI through a Balanced Leaderboard of Safety and Capability
by: Li, Haonan, et al.
Published: (2024)
La Leaderboard: A Large Language Model Leaderboard for Spanish Varieties and Languages of Spain and Latin America
by: Grandury, María, et al.
Published: (2025)
LLMs4OM: Matching Ontologies with Large Language Models
by: Giglou, Hamed Babaei, et al.
Published: (2024)
OntoAligner Meets Knowledge Graph Embedding Aligners
by: Giglou, Hamed Babaei, et al.
Published: (2025)
Open Universal Arabic ASR Leaderboard
by: Wang, Yingzhi, et al.
Published: (2024)
Computational Fact-Checking of Online Discourse: Scoring scientific accuracy in climate change related news articles
by: Wittenborg, Tim, et al.
Published: (2025)
Misconfidence-based Demonstration Selection for LLM In-Context Learning
by: Xu, Shangqing, et al.
Published: (2024)
A Position Paper on the Automatic Generation of Machine Learning Leaderboards
by: Timmer, Roelien C, et al.
Published: (2025)