Bibliographic details
Main author:	Mahamat Saleh, Samir Adam Annour
Format:	Digital resource
Language:	English
Published:	Zenodo 2026
Subjects:	Large Language Models Venture Capital Prediction Model Calibration Imbalanced Classification Rare-Event Prediction Prompt Engineering Financial Decision Support VCBench Artificial Intelligence in Finance Local LLMs
Online access:	https://doi.org/10.5281/zenodo.20392941

Table of contents:

Abstract: Venture-capital screening is an imbalanced rare-event prediction problem in which only a small fraction of founders produce extreme outcomes, while false positives waste analyst attention and capital. This study examines the calibration failure modes of local large language models on VCBench, a benchmark for predicting founder success from anonymized pre-founding profiles. We evaluate local Ollama inference with two no-thinking Qwen variants, qwen3:32b and Qwen3-30B-A3B-GGUF:Q4_K_M, on stratified 120-profile validation subsets, and compare these results with trivial baselines and a logistic-regression TF-IDF baseline evaluated on both the 120-profile subset and the full 900-profile public validation split. The main finding is that prompt engineering shifts the predicted-positive rate rather than reliably improving discrimination. Few-Shot prompting increases recall mainly by predicting many more positives, while Vanilla prompting is more conservative but still has wide confidence intervals. A simple LR-TF-IDF classifier achieves stronger F0.5 performance on the available validation data than the tested local LLM prompting configurations. These results motivate a reporting standard for rare-event LLM benchmarks: predicted-positive rate, trivial baselines, precision-recall analysis, confusion-matrix counts, and confidence intervals should be reported alongside any Fβ score.
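
As a rough illustration of why the abstract asks for the predicted-positive rate and trivial baselines next to any Fβ score, the sketch below computes both from confusion-matrix counts. All counts are invented for illustration (a hypothetical 120-profile subset with 12 positives) and are not results from the study; the function names and configuration labels are likewise assumptions. It uses the standard definition Fβ = (1 + β²)·TP / ((1 + β²)·TP + β²·FN + FP).

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 0.5) -> float:
    """F-beta from confusion counts; returns 0.0 when the score is undefined."""
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


def predicted_positive_rate(tp: int, fp: int, fn: int, tn: int) -> float:
    """Share of all profiles that the classifier labels positive."""
    total = tp + fp + fn + tn
    return (tp + fp) / total if total else 0.0


# Hypothetical counts (tp, fp, fn, tn) on 120 profiles with 12 true positives.
# These numbers are made up to show the reporting pattern, nothing more.
configs = {
    "always-positive baseline": (12, 108, 0, 0),
    "always-negative baseline": (0, 0, 12, 108),
    "conservative prompt (illustrative)": (4, 6, 8, 102),
    "liberal prompt (illustrative)": (9, 41, 3, 67),
}

for name, (tp, fp, fn, tn) in configs.items():
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    print(
        f"{name}: PPR={predicted_positive_rate(tp, fp, fn, tn):.3f} "
        f"recall={recall:.3f} F0.5={f_beta(tp, fp, fn):.3f}"
    )
```

In this made-up example the liberal configuration has higher recall than the conservative one only because it flags far more profiles as positive, and its F0.5 is lower. Reading the score together with the predicted-positive rate and the trivial baselines makes that kind of shift visible.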
