| Main Authors: | Tabib, H. M. Shadman, Deedar, Jaber Ahmed |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2501.04425 |
Similar Items
Toward Trustworthy Difficulty Assessments: Large Language Models as Judges in Programming and Synthetic Tasks
by: Tabib, H. M. Shadman, et al.
Published: (2025)
Study on Locomotive Epidemic Dynamics in a Stochastic Spatio-Temporal Simulation Model on a Multiplex Network
by: Tabib, H. M. Shadman, et al.
Published: (2025)
Challenging the Boundaries of Reasoning: An Olympiad-Level Math Benchmark for Large Language Models
by: Sun, Haoxiang, et al.
Published: (2025)
MathOdyssey: Benchmarking Mathematical Problem-Solving Skills in Large Language Models Using Odyssey Math Data
by: Fang, Meng, et al.
Published: (2024)
BanglaDialecto: An End-to-End AI-Powered Regional Speech Standardization
by: Samin, Md. Nazmus Sadat, et al.
Published: (2024)
Leveraging Online Olympiad-Level Math Problems for LLMs Training and Contamination-Resistant Evaluation
by: Mahdavi, Sadegh, et al.
Published: (2025)
Can Language Models Solve Olympiad Programming?
by: Shi, Quan, et al.
Published: (2024)
Leveraging Large Language Models for Bengali Math Word Problem Solving with Chain of Thought Reasoning
by: Paul, Bidyarthi, et al.
Published: (2025)
TeleMath: A Benchmark for Large Language Models in Telecom Mathematical Problem Solving
by: Colle, Vincenzo, et al.
Published: (2025)
ChatGLM-Math: Improving Math Problem-Solving in Large Language Models with a Self-Critique Pipeline
by: Xu, Yifan, et al.
Published: (2024)
Leveraging Graph Structures and Large Language Models for End-to-End Synthetic Task-Oriented Dialogues
by: Medjad, Maya, et al.
Published: (2025)
Using Large Language Model for End-to-End Chinese ASR and NER
by: Li, Yuang, et al.
Published: (2024)
An End-to-End Speech Summarization Using Large Language Model
by: Shang, Hengchao, et al.
Published: (2024)
Improving Math Problem Solving in Large Language Models Through Categorization and Strategy Tailoring
by: Akella, Amogh
Published: (2024)
Vashantor: A Large-scale Multilingual Benchmark Dataset for Automated Translation of Bangla Regional Dialects to Bangla Language
by: Faria, Fatema Tuj Johora, et al.
Published: (2023)
Automatic End-to-End Data Integration using Large Language Models
by: Steiner, Aaron, et al.
Published: (2026)
OlympiadBench: A Challenging Benchmark for Promoting AGI with Olympiad-Level Bilingual Multimodal Scientific Problems
by: He, Chaoqun, et al.
Published: (2024)
Proof or Bluff? Evaluating LLMs on 2025 USA Math Olympiad
by: Petrov, Ivo, et al.
Published: (2025)
MathLearner: A Large Language Model Agent Framework for Learning to Solve Mathematical Problems
by: Xie, Wenbei, et al.
Published: (2024)
Benchmarking Hallucination in Large Language Models based on Unanswerable Math Word Problem
by: Sun, Yuhong, et al.
Published: (2024)
MathGLM-Vision: Solving Mathematical Problems with Multi-Modal Large Language Model
by: Yang, Zhen, et al.
Published: (2024)
End-to-End Ontology Learning with Large Language Models
by: Lo, Andy, et al.
Published: (2024)
Omni-MATH: A Universal Olympiad Level Mathematic Benchmark For Large Language Models
by: Gao, Bofei, et al.
Published: (2024)
Building Safe GenAI Applications: An End-to-End Overview of Red Teaming for Large Language Models
by: Purpura, Alberto, et al.
Published: (2025)
Long-horizon Reasoning Agent for Olympiad-Level Mathematical Problem Solving
by: Gao, Songyang, et al.
Published: (2025)
E2Edev: Benchmarking Large Language Models in End-to-End Software Development Task
by: Liu, Jingyao, et al.
Published: (2025)
End-to-End Graph Flattening Method for Large Language Models
by: Hong, Bin, et al.
Published: (2024)
End-To-End Clinical Trial Matching with Large Language Models
by: Ferber, Dyke, et al.
Published: (2024)
Solving Math Word Problems via Cooperative Reasoning induced Language Models
by: Zhu, Xinyu, et al.
Published: (2022)
Performance Evaluation of Large Language Models in Bangla Consumer Health Query Summarization
by: Abrar, Ajwad, et al.
Published: (2025)
Large Language Models Struggle with Unreasonability in Math Problems
by: Ma, Jingyuan, et al.
Published: (2024)
OMIBench: Benchmarking Olympiad-Level Multi-Image Reasoning in Large Vision-Language Model
by: Chen, Qiguang, et al.
Published: (2026)
LLaST: Improved End-to-end Speech Translation System Leveraged by Large Language Models
by: Chen, Xi, et al.
Published: (2024)
MathMist: A Parallel Multilingual Benchmark Dataset for Mathematical Problem Solving and Reasoning
by: Sobhani, Mahbub E, et al.
Published: (2025)
Can LLMs Generate and Solve Linguistic Olympiad Puzzles?
by: Majmudar, Neh, et al.
Published: (2025)
PromptCoT: Synthesizing Olympiad-level Problems for Mathematical Reasoning in Large Language Models
by: Zhao, Xueliang, et al.
Published: (2025)
THaMES: An End-to-End Tool for Hallucination Mitigation and Evaluation in Large Language Models
by: Liang, Mengfei, et al.
Published: (2024)
Solving for X and Beyond: Can Large Language Models Solve Complex Math Problems with More-Than-Two Unknowns?
by: Kao, Kuei-Chun, et al.
Published: (2024)
A Benchmark for End-to-End Zero-Shot Biomedical Relation Extraction with LLMs: Experiments with OpenAI Models
by: Brokman, Aviv, et al.
Published: (2025)
An End-to-End Approach for Child Reading Assessment in the Xhosa Language
by: Chevtchenko, Sergio, et al.
Published: (2025)