| Main Authors: | Srivastava, Saurabh, B, Annarose M, P V, Anto, Menon, Shashank, Sukumar, Ajay, T, Adwaith Samod, Philipose, Alan, Prince, Stevin, Thomas, Sooraj |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2402.19450 |
Similar Items
Neurosymbolic Language Reasoning as Satisfiability Modulo Theory
by: Oh, Hyunseok, et al.
Published: (2026)
DISCERN: Decoding Systematic Errors in Natural Language for Text Classifiers
by: Menon, Rakesh R., et al.
Published: (2024)
The Point of No Return: Counterfactual Localization of Deceptive Commitment in Language-Model Reasoning
by: Merrill, Scott, et al.
Published: (2026)
An Objective Performance Evaluation of the LSTM Networks in Time Series Classification
by: Sunil, Sooraj, et al.
Published: (2026)
Revisiting Prompt Optimization with Large Reasoning Models-A Case Study on Event Extraction
by: Srivastava, Saurabh, et al.
Published: (2025)
A Causal Lens for Evaluating Faithfulness Metrics
by: Zaman, Kerem, et al.
Published: (2025)
INTERACT: Enabling Interactive, Question-Driven Learning in Large Language Models
by: Kendapadi, Aum, et al.
Published: (2024)
Benchmarking Spatiotemporal Reasoning in LLMs and Reasoning Models: Capabilities and Challenges
by: Quan, Pengrui, et al.
Published: (2025)
Evaluating Chain-of-Thought Reasoning through Reusability and Verifiability
by: Aggarwal, Shashank, et al.
Published: (2026)
Continuous Optimization for Decoding Errors
by: Srivastava, Shashank
Published: (2024)
Improved List Size for Folded Reed-Solomon Codes
by: Srivastava, Shashank
Published: (2024)
Voice Evaluation of Reasoning Ability: Diagnosing the Modality-Induced Performance Gap
by: Lin, Yueqian, et al.
Published: (2025)
Real-Time Performance Benchmarking of TinyML Models in Embedded Systems (PICO: Performance of Inference, CPU, and Operations)
by: Dey, Abhishek, et al.
Published: (2025)
Bridging the Arithmetic Gap: The Cognitive Complexity Benchmark and Financial-PoT for Robust Financial Reasoning
by: Zhao, Boxiang, et al.
Published: (2026)
JudgeBoard: Benchmarking and Enhancing Small Language Models for Reasoning Evaluation
by: Bi, Zhenyu, et al.
Published: (2025)
Bullous Lung Disease in Turner Syndrome: An Underrecognized Comorbidity?
by: Stevin Lu, et al.
Published: (2024)
Ai-Powered Sales Demand Forecasting and Desicion Support system
by: M, Nithin, et al.
Published: (2026)
MetaCluster: Enabling Deep Compression of Kolmogorov-Arnold Network
by: Raffel, Matthew, et al.
Published: (2025)
Enhancing the Diagnostic Evaluation of Thyroid Functionality Using Diffuse Reflectance Spectroscopy and Regression Models
by: W. Anto Win Shalini, et al.
Published: (2025)
Integrating Performance Tools in Model Reasoning for GPU Kernel Optimization
by: Nichols, Daniel, et al.
Published: (2025)
Pharmacognostical and Preliminary phytochemical evaluation of Seed kernel of Chinchasthi (Tamarindus indica Linn)
by: MS Megha, et al.
Published: (2026)
The intersection of philosophy of language and artificial intelligence: Challenges in replicating human language understanding
by: Sooraj Kumar Maurya
Published: (2024)
Cost Trade-offs of Reasoning and Non-Reasoning Large Language Models in Text-to-SQL
by: Deochake, Saurabh, et al.
Published: (2025)
An Enigma of Artificial Reason: Investigating the Production-Evaluation Gap in Large Reasoning Models
by: Sun, Mingzhong, et al.
Published: (2026)
Robustness and Reasoning Fidelity of Large Language Models in Long-Context Code Question Answering
by: Maharaj, Kishan, et al.
Published: (2026)
Enhancing Domain-Specific Retrieval-Augmented Generation: Synthetic Data Generation and Evaluation using Reasoning Models
by: Jadon, Aryan, et al.
Published: (2025)
Therapeutic Potential Of M@B 40 (M = Mg and Ca) Fullerene as a Drug Delivery System for Gemcitabine Anti‐Lung Cancer Drug: A DFT Approach
by: Abisha Nancy Sukumar, et al.
Published: (2025)
Mind the Gap: Benchmarking Spatial Reasoning in Vision-Language Models
by: Stogiannidis, Ilias, et al.
Published: (2025)
SocialGaze: Improving the Integration of Human Social Norms in Large Language Models
by: Vijjini, Anvesh Rao, et al.
Published: (2024)
LLMs Cannot Reliably Identify and Reason About Security Vulnerabilities (Yet?): A Comprehensive Evaluation, Framework, and Benchmarks
by: Ullah, Saad, et al.
Published: (2023)
A Robust Placeability Metric for Model-Free Unified Pick-and-Place Reasoning
by: Wingender, Benno, et al.
Published: (2025)
ECG-Reasoning-Benchmark: A Benchmark for Evaluating Clinical Reasoning Capabilities in ECG Interpretation
by: Oh, Jungwoo, et al.
Published: (2026)
RUPBench: Benchmarking Reasoning Under Perturbations for Robustness Evaluation in Large Language Models
by: Wang, Yuqing, et al.
Published: (2024)
Autonomous Evaluation of LLMs for Truth Maintenance and Reasoning Tasks
by: Karia, Rushang, et al.
Published: (2024)
Benchmarking Reasoning Robustness in Large Language Models
by: Yu, Tong, et al.
Published: (2025)
Mind the Gap: Evaluating the Representativeness of Quantitative Medical Language Reasoning LLM Benchmarks for African Disease Burdens
by: Mutisya, Fred, et al.
Published: (2025)
Is Chain-of-Thought Really Not Explainability? Chain-of-Thought Can Be Faithful without Hint Verbalization
by: Zaman, Kerem, et al.
Published: (2025)
List Decoding Expander-Based Codes up to Capacity in Near-Linear Time
by: Srivastava, Shashank, et al.
Published: (2025)
Point of Order: Action-Aware LLM Persona Modeling for Realistic Civic Simulation
by: Merrill, Scott, et al.
Published: (2025)
MOAT: MobileNet‐Optimized Attention Transfer for Robust and Scalable Dermatology Image Classification
by: Pradeep Radhakrishnan, et al.
Published: (2025)