Bibliographic Details
Main Authors:	Kang, Zhaolu; Gong, Junhao; Hu, Wenqing; Yin, Shuo; Jiang, Kehan; Fang, Zhicheng; He, Yingjie; Meng, Chunlei; Fu, Rong; Chen, Dongyang; Zheng, Leqi; Jiang, Eric Hanchen; Feng, Yunfei; Leng, Yitong; Zhu, Junfan; Chen, Xiaoyou; Yang, Xi; Xuan, Richeng
Format:	Preprint
Published:	2026
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2601.08689

Abstract:

Large Language Models (LLMs) have shown strong capabilities across many domains, yet their evaluation in financial quantitative tasks remains fragmented and mostly limited to knowledge-centric question answering. We introduce QuantEval, a benchmark that evaluates LLMs across three essential dimensions of quantitative finance: knowledge-based QA, quantitative mathematical reasoning, and quantitative strategy coding. Unlike prior financial benchmarks, QuantEval integrates a CTA-style backtesting framework that executes model-generated strategies and evaluates them using financial performance metrics, enabling a more realistic assessment of quantitative coding ability. We evaluate some state-of-the-art open-source and proprietary LLMs and observe substantial gaps to human experts, particularly in reasoning and strategy coding. Finally, we conduct large-scale supervised fine-tuning and reinforcement learning experiments on domain-aligned data, demonstrating consistent improvements. We hope QuantEval will facilitate research on LLMs' quantitative finance capabilities and accelerate their practical adoption in real-world trading workflows. We additionally release the full deterministic backtesting configuration (asset universe, cost model, and metric definitions) to ensure strict reproducibility.
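To make the idea of scoring a generated strategy with financial performance metrics concrete, the following is a minimal Python sketch of a deterministic backtest. It is not QuantEval's actual framework: the function names (`backtest`, `sharpe_ratio`, `max_drawdown`), the proportional cost model, the `cost_rate` default, and the 252-period annualization are all assumptions chosen for illustration, and the benchmark's released configuration may define these differently.

```python
import math


def backtest(prices, positions, cost_rate=0.0005):
    """Return per-period net returns for a position series.

    positions[t] is the exposure (e.g. -1, 0, 1) held while the price moves
    from prices[t] to prices[t + 1]. cost_rate is a hypothetical proportional
    cost charged on each change in position.
    """
    if len(positions) != len(prices) - 1:
        raise ValueError("need exactly one position per price interval")
    returns = []
    prev = 0.0  # start flat, so the first trade also pays costs
    for t, pos in enumerate(positions):
        gross = pos * (prices[t + 1] / prices[t] - 1.0)
        cost = cost_rate * abs(pos - prev)
        returns.append(gross - cost)
        prev = pos
    return returns


def sharpe_ratio(returns, periods_per_year=252):
    """Annualized Sharpe ratio with a zero risk-free rate (an assumption)."""
    n = len(returns)
    if n < 2:
        return 0.0
    mean = sum(returns) / n
    var = sum((r - mean) ** 2 for r in returns) / (n - 1)  # sample variance
    if var == 0:
        return 0.0
    return mean / math.sqrt(var) * math.sqrt(periods_per_year)


def max_drawdown(returns):
    """Largest peak-to-trough loss of the compounded equity curve, in [0, 1]."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return worst
```

In a setup like this, a model-generated strategy would only need to produce the `positions` series. Because the cost model and metric definitions are fixed functions with no randomness, the same strategy always receives the same score, which is the property a deterministic backtesting configuration is meant to guarantee.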

Similar Items