| Main Authors: | Nazi, Zabir Al, Dipta, Shubhashis Roy, Kar, Sudipta |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2601.06853 |
Similar Items
Omni-Modal Dissonance Benchmark: Systematically Breaking Modality Consensus to Probe Robustness and Calibrated Abstention
by: Nazi, Zabir Al, et al.
Published: (2026)
TRIAGE: Evaluating Prospective Metacognitive Control in LLMs under Resource Constraints
by: Nazi, Zabir Al, et al.
Published: (2026)
GanitLLM: Difficulty-Aware Bengali Mathematical Reasoning through Curriculum-GRPO
by: Dipta, Shubhashis Roy, et al.
Published: (2026)
FedMentor: Domain-Aware Differential Privacy for Heterogeneous Federated LLMs in Mental Health
by: Sarwar, Nobin, et al.
Published: (2025)
HU at SemEval-2024 Task 8A: Can Contrastive Learning Learn Embeddings to Detect Machine-Generated Text?
by: Dipta, Shubhashis Roy, et al.
Published: (2024)
UMBCLU at SemEval-2024 Task 1A and 1C: Semantic Textual Relatedness with and without machine translation
by: Dipta, Shubhashis Roy, et al.
Published: (2024)
BanglaTalk: Towards Real-Time Speech Assistance for Bengali Regional Dialects
by: Hasan, Jakir, et al.
Published: (2025)
DecomposeRL: Learning to Ask Useful, Informative, and Diverse Questions for Semi-Supervised, Traceable Claim Verification
by: Dipta, Shubhashis Roy, et al.
Published: (2026)
Large language models in healthcare and medical domain: A review
by: Nazi, Zabir Al, et al.
Published: (2023)
BanglaLlama: LLaMA for Bangla Language
by: Zehady, Abdullah Khan, et al.
Published: (2024)
PA3: Policy-Aware Agent Alignment through Chain-of-Thought
by: Dipta, Shubhashis Roy, et al.
Published: (2026)
If We May De-Presuppose: Robustly Verifying Claims through Presupposition-Free Question Decomposition
by: Dipta, Shubhashis Roy, et al.
Published: (2025)
Are Vision Language Models Cross-Cultural Theory of Mind Reasoners?
by: Nazi, Zabir Al, et al.
Published: (2025)
Q2E: Query-to-Event Decomposition for Zero-Shot Multilingual Text-to-Video Retrieval
by: Dipta, Shubhashis Roy, et al.
Published: (2025)
Learning How to Use Tools, Not Just When: Pattern-Aware Tool-Integrated Reasoning
by: Xu, Ningning, et al.
Published: (2025)
PromptGuard at BLP-2025 Task 1: A Few-Shot Classification Framework Using Majority Voting and Keyword Similarity for Bengali Hate Speech Detection
by: Hossan, Rakib, et al.
Published: (2025)
Cross-Lingual Sentiment Misalignment: Auditing Multilingual Language Models for Inversion Risk, Dialectal Representation, and Affective Stability
by: Lia, Nusrat Jahan, et al.
Published: (2026)
DiVERT: Distractor Generation with Variational Errors Represented as Text for Math Multiple-choice Questions
by: Fernandez, Nigel, et al.
Published: (2024)
Executable Functional Abstractions: Inferring Generative Programs for Advanced Math Problems
by: Khan, Zaid, et al.
Published: (2025)
VC-Inspector: Advancing Reference-free Evaluation of Video Captions with Factual Analysis
by: Dipta, Shubhashis Roy, et al.
Published: (2025)
Can LLMs $\textit{understand}$ Math? -- Exploring the Pitfalls in Mathematical Reasoning
by: Roy, Tiasa Singha, et al.
Published: (2025)
MathChat: Converse to Tackle Challenging Math Problems with LLM Agents
by: Wu, Yiran, et al.
Published: (2023)
PrismRAG: Boosting RAG Factuality with Distractor Resilience and Strategized Reasoning
by: Kachuee, Mohammad, et al.
Published: (2025)
Leveraging Large Language Models for Bengali Math Word Problem Solving with Chain of Thought Reasoning
by: Paul, Bidyarthi, et al.
Published: (2025)
PII-VisBench: Evaluating Personally Identifiable Information Safety in Vision Language Models Along a Continuum of Visibility
by: Shahariar, G M, et al.
Published: (2026)
Understanding the Effects of Distractors on Reasoning Vision-Language Models
by: Bae, Jiyun, et al.
Published: (2025)
AgentCollabBench: Diagnosing When Good Agents Make Bad Collaborators
by: Mazumder, Aritra, et al.
Published: (2026)
DisGeM: Distractor Generation for Multiple Choice Questions with Span Masking
by: Cavusoglu, Devrim, et al.
Published: (2024)
MuggleMath: Assessing the Impact of Query and Response Augmentation on Math Reasoning
by: Li, Chengpeng, et al.
Published: (2023)
Benchmarking Large Language Models for Math Reasoning Tasks
by: Seßler, Kathrin, et al.
Published: (2024)
StreetMath: Study of LLMs' Approximation Behaviors
by: Tseng, Chiung-Yi, et al.
Published: (2025)
AceMath: Advancing Frontier Math Reasoning with Post-Training and Reward Modeling
by: Liu, Zihan, et al.
Published: (2024)
Inference-Time Rethinking with Latent Thought Vectors for Math Reasoning
by: Kong, Deqian, et al.
Published: (2026)
NFT: Bridging Supervised Learning and Reinforcement Learning in Math Reasoning
by: Chen, Huayu, et al.
Published: (2025)
Generating Plausible Distractors for Multiple-Choice Questions via Student Choice Prediction
by: Lee, Yooseop, et al.
Published: (2025)
CDGP: Automatic Cloze Distractor Generation based on Pre-trained Language Model
by: Chiang, Shang-Hsuan, et al.
Published: (2024)
Improving Automated Distractor Generation for Math Multiple-choice Questions with Overgenerate-and-rank
by: Scarlatos, Alexander, et al.
Published: (2024)
Transformer-based Joint Modelling for Automatic Essay Scoring and Off-Topic Detection
by: Das, Sourya Dipta, et al.
Published: (2024)
Cornerstones or Stumbling Blocks? Deciphering the Rock Tokens in On-Policy Distillation
by: Jiang, Yuxuan, et al.
Published: (2026)
Investigating Bias: A Multilingual Pipeline for Generating, Solving, and Evaluating Math Problems with LLMs
by: Mahran, Mariam, et al.
Published: (2025)