| Main Authors: | Qian, Lingfei, Zhou, Weipeng, Wang, Yan, Peng, Xueqing, Yi, Han, Zhao, Yilun, Huang, Jimin, Xie, Qianqian, Nie, Jian-yun |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2502.08127 |
Similar Items
Plutus: Benchmarking Large Language Models in Low-Resource Greek Finance
by: Peng, Xueqing, et al.
Published: (2025)
OrdRankBen: A Novel Ranking Benchmark for Ordinal Relevance in NLP
by: Wang, Yan, et al.
Published: (2025)
MMAFFBen: A Multilingual and Multimodal Affective Analysis Benchmark for Evaluating LLMs and VLMs
by: Liu, Zhiwei, et al.
Published: (2025)
Moira: Language-driven Hierarchical Reinforcement Learning for Pair Trading
by: Giannouris, Polydoros, et al.
Published: (2026)
Ebisu: Benchmarking Large Language Models in Japanese Finance
by: Peng, Xueqing, et al.
Published: (2026)
FinTagging: Benchmarking LLMs for Extracting and Structuring Financial Information
by: Wang, Yan, et al.
Published: (2025)
Conv-FinRe: A Conversational and Longitudinal Benchmark for Utility-Grounded Financial Recommendation
by: Wang, Yan, et al.
Published: (2026)
FinAuditing: A Financial Taxonomy-Structured Multi-Document Benchmark for Evaluating LLMs
by: Wang, Yan, et al.
Published: (2025)
Can LLM Agents Be CFOs? Benchmarking Long-Horizon Resource Allocation in an Uncertain Enterprise Environment
by: Han, Yi, et al.
Published: (2026)
FinanceMath: Knowledge-Intensive Math Reasoning in Finance Domains
by: Zhao, Yilun, et al.
Published: (2023)
When Agents Trade: Live Multi-Market Trading Benchmark for LLM Agents
by: Qian, Lingfei, et al.
Published: (2025)
Me LLaMA: Foundation Large Language Models for Medical Applications
by: Xie, Qianqian, et al.
Published: (2024)
FinCriticalED: A Visual Benchmark for Financial Fact-Level OCR
by: He, Yueru, et al.
Published: (2025)
TRN-R1-Zero: Text-rich Network Reasoning via LLMs with Reinforcement Learning Only
by: Liu, Yilun, et al.
Published: (2026)
INVESTORBENCH: A Benchmark for Financial Decision-Making Tasks with LLM-based Agent
by: Li, Haohang, et al.
Published: (2024)
CDEMapper: Enhancing NIH Common Data Element Normalization using Large Language Models
by: Wang, Yan, et al.
Published: (2024)
FLAG-Trader: Fusion LLM-Agent with Gradient-based Reinforcement Learning for Financial Trading
by: Xiong, Guojun, et al.
Published: (2025)
Retrieval-augmented Large Language Models for Financial Time Series Forecasting
by: Xiao, Mengxi, et al.
Published: (2025)
HARMONIC: Harnessing LLMs for Tabular Data Synthesis and Privacy Protection
by: Wang, Yuxin, et al.
Published: (2024)
R-Log: Incentivizing Log Analysis Capability in LLMs via Reasoning-based Reinforcement Learning
by: Liu, Yilun, et al.
Published: (2025)
The CLEF-2026 FinMMEval Lab: Multilingual and Multimodal Evaluation of Financial AI Systems
by: Xie, Zhuohan, et al.
Published: (2026)
HealMe: Harnessing Cognitive Reframing in Large Language Models for Psychotherapy
by: Xiao, Mengxi, et al.
Published: (2024)
AAPO: Enhancing the Reasoning Capabilities of LLMs with Advantage Margin
by: Xiong, Jian, et al.
Published: (2025)
Dólares or Dollars? Unraveling the Bilingual Prowess of Financial LLMs Between Spanish and English
by: Zhang, Xiao, et al.
Published: (2024)
MedViz: An Agent-based, Visual-guided Research Assistant for Navigating Biomedical Literature
by: He, Huan, et al.
Published: (2026)
How Difficulty-Aware Staged Reinforcement Learning Enhances LLMs' Reasoning Capabilities: A Preliminary Experimental Study
by: Ji, Yunjie, et al.
Published: (2025)
ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning
by: Chen, Mingyang, et al.
Published: (2025)
RealFin: How Well Do LLMs Reason About Finance When Users Leave Things Unsaid?
by: Dai, Yuyang, et al.
Published: (2026)
CHAMP: A Competition-level Dataset for Fine-Grained Analyses of LLMs' Mathematical Reasoning Capabilities
by: Mao, Yujun, et al.
Published: (2024)
Legal$Δ$: Enhancing Legal Reasoning in LLMs via Reinforcement Learning with Chain-of-Thought Guided Information Gain
by: Dai, Xin, et al.
Published: (2025)
An explicit local geometric Langlands for supercuspidal representations: the toral case
by: Yi, Lingfei
Published: (2025)
AuditWen: An Open-Source Large Language Model for Audit
by: Huang, Jiajia, et al.
Published: (2024)
A Time‐Delayed Dengue Transmission Model With Seasonal Variations
by: Weipeng Zhang, et al.
Published: (2025)
Revolutionizing Finance with LLMs: An Overview of Applications and Insights
by: Zhao, Huaqin, et al.
Published: (2024)
Concordia: Self-Improving Synthetic Tables for Federated LLMs
by: Huang, Jimin, et al.
Published: (2026)
R1-Code-Interpreter: LLMs Reason with Code via Supervised and Multi-stage Reinforcement Learning
by: Chen, Yongchao, et al.
Published: (2025)
From Reviewers' Lens: Understanding Bug Bounty Report Invalid Reasons with LLMs
by: Zheng, Jiangrui, et al.
Published: (2025)
MetaAligner: Towards Generalizable Multi-Objective Alignment of Language Models
by: Yang, Kailai, et al.
Published: (2024)
MentaLLaMA: Interpretable Mental Health Analysis on Social Media with Large Language Models
by: Yang, Kailai, et al.
Published: (2023)
Selective Preference Optimization via Token-Level Reward Function Estimation
by: Yang, Kailai, et al.
Published: (2024)