| Main Authors: | Dou, Shaoyu, Shen, Yutian, Chen, Mofan, Wang, Zixuan, Xu, Jiajie, Guo, Qi, Shao, Kailai, Chen, Chao, Hu, Haixiang, Shi, Haibo, Min, Min, Zhang, Liwen |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2506.21591 |
Similar Items
FinEval: A Chinese Financial Domain Knowledge Evaluation Benchmark for Large Language Models
by: Guo, Xin, et al.
Published: (2023)
Evaluating Scoring Bias in LLM-as-a-Judge
by: Li, Qingquan, et al.
Published: (2025)
VisFinEval: A Scenario-Driven Chinese Multimodal Benchmark for Holistic Financial Understanding
by: Liu, Zhaowei, et al.
Published: (2025)
FinGAIA: A Chinese Benchmark for AI Agents in Real-World Financial Domain
by: Zeng, Lingfeng, et al.
Published: (2025)
FinSight: Towards Real-World Financial Deep Research
by: Jin, Jiajie, et al.
Published: (2025)
UniFinEval: Towards Unified Evaluation of Financial Multimodal Models across Text, Images and Videos
by: Yang, Zhi, et al.
Published: (2026)
Fin-R1: A Large Language Model for Financial Reasoning through Reinforcement Learning
by: Liu, Zhaowei, et al.
Published: (2025)
OmniEval: An Omnidirectional and Automatic RAG Evaluation Benchmark in Financial Domain
by: Wang, Shuting, et al.
Published: (2024)
FinSafetyBench: Evaluating LLM Safety in Real-World Financial Scenarios
by: Hou, Yutao, et al.
Published: (2026)
Fin-PRM: A Domain-Specialized Process Reward Model for Financial Reasoning in Large Language Models
by: Zhu, Jie, et al.
Published: (2025)
FinMR: A Knowledge-Intensive Multimodal Benchmark for Advanced Financial Reasoning
by: Deng, Shuangyan, et al.
Published: (2025)
FinDeepResearch: Evaluating Deep Research Agents in Rigorous Financial Analysis
by: Zhu, Fengbin, et al.
Published: (2025)
FinGuard: Detecting Financial Regulatory Non-Compliance in LLM Interactions
by: Dou, Huaixia, et al.
Published: (2026)
FinReflectKG -- EvalBench: Benchmarking Financial KG with Multi-Dimensional Evaluation
by: Dimino, Fabrizio, et al.
Published: (2025)
FinTeam: A Multi-Agent Collaborative Intelligence System for Comprehensive Financial Scenarios
by: Wu, Yingqian, et al.
Published: (2025)
Agentar-Fin-R1: Enhancing Financial Intelligence through Domain Expertise, Training Efficiency, and Advanced Reasoning
by: Zheng, Yanjun, et al.
Published: (2025)
OneEval: Benchmarking LLM Knowledge-intensive Reasoning over Diverse Knowledge Bases
by: Chen, Yongrui, et al.
Published: (2025)
FinCARE: Financial Causal Analysis with Reasoning and Evidence
by: Michel, Alejandro, et al.
Published: (2025)
FinReflectKG -- MultiHop: Financial QA Benchmark for Reasoning with Knowledge Graph Evidence
by: Arun, Abhinav, et al.
Published: (2025)
StressEval: Failure-Driven Dynamic Benchmarking for Knowledge-Intensive Reasoning in Large Language Models
by: Chen, Yongrui, et al.
Published: (2026)
FinBen: A Holistic Financial Benchmark for Large Language Models
by: Xie, Qianqian, et al.
Published: (2024)
FinVault: Benchmarking Financial Agent Safety in Execution-Grounded Environments
by: Yang, Zhi, et al.
Published: (2026)
Event-assisted 12-stop HDR Imaging of Dynamic Scene
by: Guo, Shi, et al.
Published: (2024)
Frames2Residual: Spatiotemporal Decoupling for Self-Supervised Video Denoising
by: Ji, Mingjie, et al.
Published: (2026)
FinCoT: Grounding Chain-of-Thought in Expert Financial Reasoning
by: Nitarach, Natapong, et al.
Published: (2025)
FinTradeBench: A Financial Reasoning Benchmark for LLMs
by: Agrawal, Yogesh, et al.
Published: (2026)
ICPC-Eval: Probing the Frontiers of LLM Reasoning with Competitive Programming Contests
by: Xu, Shiyi, et al.
Published: (2025)
FinSTaR: Towards Financial Reasoning with Time Series Reasoning Models
by: Lee, Seunghan, et al.
Published: (2026)
AraFinNews: Arabic Financial Summarisation with Domain-Adapted LLMs
by: El-Haj, Mo, et al.
Published: (2025)
FinFlier: Automating Graphical Overlays for Financial Visualizations with Knowledge-Grounding Large Language Model
by: Hao, Jianing, et al.
Published: (2024)
FinZero: Launching Multi-modal Financial Time Series Forecast with Large Reasoning Model
by: Wang, Yanlong, et al.
Published: (2025)
FinHEAR: Human Expertise and Adaptive Risk-Aware Temporal Reasoning for Financial Decision-Making
by: Chen, Jiaxiang, et al.
Published: (2025)
FinChain: A Symbolic Benchmark for Verifiable Chain-of-Thought Financial Reasoning
by: Xie, Zhuohan, et al.
Published: (2025)
FinKario: Event-Enhanced Automated Construction of Financial Knowledge Graph
by: Li, Xiang, et al.
Published: (2025)
FinReflectKG: Agentic Construction and Evaluation of Financial Knowledge Graphs
by: Arun, Abhinav, et al.
Published: (2025)
Understanding Representation of Deep Equilibrium Models from Neural Collapse Perspective
by: Sun, Haixiang, et al.
Published: (2024)
FinReasoning: A Hierarchical Benchmark for Reliable Financial Research Reporting
by: Zhu, Yiyun, et al.
Published: (2026)
FinMME: Benchmark Dataset for Financial Multi-Modal Reasoning Evaluation
by: Luo, Junyu, et al.
Published: (2025)
DocFinQA: A Long-Context Financial Reasoning Dataset
by: Reddy, Varshini, et al.
Published: (2024)
FinDebate: Multi-Agent Collaborative Intelligence for Financial Analysis
by: Cai, Tianshi, et al.
Published: (2025)