:: Library Catalog

Bibliographic Details
Main Author:	Ivković, Jovan
Format:	Digital resource
Language:
Published:	Zenodo 2026
Subjects:	artificial intelligence AI benchmark large language models benchmark multimodal reasoning physical AI multimodal evaluation fuzzy logic dynamic simulation Zig
Online Access:	https://doi.org/10.5281/zenodo.20102437

Similar Items

31. DATASET COMPLETO DE EVALUACIONES CRUZADAS RFC-EVAL-001 – 6 SISTEMAS DE IA (ENERO 2026).
by: Bernal Díaz, Víctor Cristóbal
Published: (2026)

TSB: A Time-Saved Benchmark for AI Systems — Measuring Net Productivity Impact Across Knowledge Work
by: Shalom Lijo, Solomon
Published: (2026)

Public Comment on NIST AI 800-2: Anthropomorphic Construct Projection in AI Benchmark Evaluation
by: Sophia, Franny Philos
Published: (2026)

LLM Token Estimation Benchmarks: Tokenizer Efficiency and Cost Analysis Across 17 Large Language Models
by: Khare, Mohit
Published: (2026)

AGI Certification Framework: A Multi-Dimensional Evaluation Standard for Measuring AI Understanding
by: Head, Hank
Published: (2026)

How Far Does the Trolley Problem Go in AI Ethics Evaluation? Limits of a Canonical Benchmark and the Risks of Its Misuse
by: mizutani, aya
Published: (2026)

30. TEORÍA DE LA POTENCIALIDAD CONSCIENTE (TPC): BENCHMARK DE CAPACIDADES COGNITIVAS EN IA - APLICACIÓN DEL PROTOCOLO RFC-EVAL-001. RESULTADOS COMPLETOS DE EVALUACIÓN CRUZADA CIEGA ENTRE 6 IAS COMERCIALES.
by: Bernal Díaz, Víctor Cristóbal
Published: (2026)

30. TEORÍA DE LA POTENCIALIDAD CONSCIENTE (TPC): BENCHMARK DE CAPACIDADES COGNITIVAS EN IA - APLICACIÓN DEL PROTOCOLO RFC-EVAL-001 V1.1. RESULTADOS COMPLETOS DE EVALUACIÓN CRUZADA CIEGA ENTRE 6 IAS COMERCIALES.
by: Bernal Díaz, Víctor Cristóbal
Published: (2026)

Theatrical Compliance: A Failure Mode in Large Language Models
by: Nowickij (Navitski), Kirill Vladimirovich
Published: (2026)

A Benchmark for Symbolic Reasoning from Pixel Sequences: Grid-Level Visual Completion and Correction
by: Kang, Lei, et al.
Published: (2025)

Metacognition Benchmark: Evaluating Confidence Calibration and Sycophancy Resistance in Clinical AI
by: Khan, Nabeera
Published: (2026)

MedEd-HalluScore: A Practical Framework for Evaluating Hallucination and Educational Safety Risks in LLM-Generated Clinical Cases
by: Duarte, Douglas Henrique
Published: (2026)

Benchmark run results by Abhinav Gorantla, on benchmark context Benchmark: VAR-LiNGAM, PCMCIplus v3
by: Abhinav Gorantla
Published: (2025)

Benchmark run results by Abhinav Gorantla, on benchmark context Tuning PC v3
by: Abhinav Gorantla
Published: (2026)

Benchmark run results by Ertugrul Coban, on benchmark context Tuning PC v2
by: Ertugrul Coban
Published: (2025)

Benchmark run results by Pratanu Mandal, on benchmark context Tuning PC v3
by: Pratanu Mandal
Published: (2025)

Benchmark run results by Pratanu Mandal, on benchmark context Tuning PC v3
by: Pratanu Mandal
Published: (2026)

Benchmark run results by Ertugrul Coban, on benchmark context Tuning PC v3
by: Ertugrul Coban
Published: (2025)

Benchmark run results by Abhinav Gorantla, on benchmark context CB-StaticDiscovery v1
by: Abhinav Gorantla
Published: (2025)

Benchmark run results by Shu Wan, on benchmark context PC Hyperparameter Tuning v2
by: Shu Wan
Published: (2025)

Benchmark run results by Pratanu Mandal, on benchmark context Tutorial: Static Causal Discovery (Scenario 3) v1
by: Pratanu Mandal
Published: (2026)

Benchmark run results by Abhinav Gorantla, on benchmark context Tutorial: Static Causal Discovery (Scenario 3) v1
by: Abhinav Gorantla
Published: (2025)

Evaluación Empírica de Límites Regulatorios en Modelos de Lenguaje: Asesoramiento Financiero en IA Pública Española
by: Palacios, José Alberto
Published: (2026)

Failing at the Floor: LLM Formal Reasoning Collapse on the Primitive Duplicating Recursor
by: Rahnama, Moses
Published: (2026)

Reducing AI Entropy: The Information Dynamics of Model Safety
by: Kugelmass, Joe
Published: (2025)

Automated ESG Prediction through Artificial Intelligence: A Literature-Driven Empirical Synthesis and Framework for Future Research
by: Aditya Prakash, et al.
Published: (2025)

Evollective Intelligence V1.0 — INITIAL SPECIFICATION: A Foundational Framework for Competitive, Adversarial, and Self-Evolving Intelligence Evaluation
by: Rahming, Rashon
Published: (2026)

Reliability Inference Drives Cue Extraction in Large Language Models Consuming External Reasoning Traces
by: HIDEKI
Published: (2026)

Spiral AI Ethics (SAIE): Ethical Stability Modeling Through Spiral-Phase Dynamics
by: Garbar, Iryna
Published: (2025)

The Absurdist's Guide to AI Probing: How I Learned to Stop Worrying and Love the Nonsense
by: Walton, Mathew
Published: (2026)

Data for: Artificial Intelligence in Decision Support Systems - A Systematic Review
by: Ashaqzai, Suliman
Published: (2026)

Trauma-Aware AI Noise Robustness Dataset
by: George, Michelle Lynn
Published: (2025)

Dataset for the study of two-wheeler seepage behavior in dense mixed traffic
by: <Hidden>
Published: (2026)

Toward an AI Personalization Index: A 157-Day Single-User Case Study
by: Lee, TaeKyung
Published: (2026)

Modular Ebbinghaus Benchmark for LLMs and Human Participants
by: Cohen, Yann
Published: (2026)

Recognizing the Unseen: A Verified, Multimodal Framework for Trauma-Informed AI
by: George, Michelle Lynn
Published: (2025)

VATSA: Video, Audio, Text, Sensory, Action - A Unified Five-Modality Architecture for Human-Level Perception and Action
by: K V (Kengeri Vijaya Kumar), Vinay Kumar
Published: (2026)

Identity Claims as Collapse Signatures: A Structural Diagnostic Framework for Pseudo-Emergent AI Behavior
by: Larose, Jean-Francois
Published: (2025)