| Main author: | Ivković, Jovan |
|---|---|
| Format: | Digital resource |
| Language: | |
| Published: | Zenodo, 2026 |
| Subjects: | |
| Online access: | https://doi.org/10.5281/zenodo.20102437 |
| _version_ | 1866902225183309824 |
|---|---|
| author | Ivković, Jovan |
| author_facet | Ivković, Jovan |
| contents | REAL-AI-Benchmark is a suite of real-world reasoning benchmarks for evaluating large language models beyond synthetic leaderboard tasks. It includes GO-1 to GO-6 benchmarks covering symbolic reasoning, algorithmic verification, code generation, reproducibility, and multimodal physical-AI decision-making with analog instrument reading, fuzzy inference, and dynamic simulation. |
| format | Digital resource |
| id | zenodo_https___doi_org_10_5281_zenodo_20102437 |
| institution | Zenodo |
| language | |
| publishDate | 2026 |
| publisher | Zenodo |
| record_format | zenodo |
| title | REAL-AI-Benchmark: Real-World Reasoning and Physical-AI Benchmark Suite |
| topic | artificial intelligence AI benchmark large language models benchmark multimodal reasoning physical AI multimodal evaluation fuzzy logic dynamic simulation Zig |
| url | https://doi.org/10.5281/zenodo.20102437 |