| Main Authors: | Satharasi, Trivikram, Iyengar, S Sitharama |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2511.05535 |
Similar Items
Evaluating Large Language Models for Causal Modeling
by: Razouk, Houssam, et al.
Published: (2024)
MATH-PT: A Math Reasoning Benchmark for European and Brazilian Portuguese
by: Teixeira, Tiago, et al.
Published: (2026)
CoMaPOI: A Collaborative Multi-Agent Framework for Next POI Prediction Bridging the Gap Between Trajectory and Language
by: Zhong, Lin, et al.
Published: (2025)
RTTC: Reward-Guided Collaborative Test-Time Compute
by: Muñoz, J. Pablo, et al.
Published: (2025)
Deploying Large Language Models With Retrieval Augmented Generation
by: Prabhune, Sonal, et al.
Published: (2024)
ReFoRCE: A Text-to-SQL Agent with Self-Refinement, Consensus Enforcement, and Column Exploration
by: Deng, Minghang, et al.
Published: (2025)
How Data Quality Affects Machine Learning Models for Credit Risk Assessment
by: Maurino, Andrea
Published: (2025)
Fanar: An Arabic-Centric Multimodal Generative AI Platform
by: Fanar Team, et al.
Published: (2025)
ChemPro: A Progressive Chemistry Benchmark for Large Language Models
by: Baranwal, Aaditya, et al.
Published: (2026)
MedSR-Vision: Deep Learning Framework for Multi-Domain Medical Image Super-Resolution
by: Gurappa, Subhash, et al.
Published: (2026)
Generative AI for Synthetic Data Generation: Methods, Challenges and the Future
by: Guo, Xu, et al.
Published: (2024)
Quo Vadis ChatGPT? From Large Language Models to Large Knowledge Models
by: Venkatasubramanian, Venkat, et al.
Published: (2024)
Modeling Emotions and Ethics with Large Language Models
by: Chang, Edward Y.
Published: (2024)
AI-Oracle Machines for Intelligent Computing
by: Wang, Jie
Published: (2024)
Aspect-Based Sentiment Analysis for Future Tourism Experiences: A BERT-MoE Framework for Persian User Reviews
by: Taskooh, Hamidreza Kazemi, et al.
Published: (2026)
Training Language Models to Win Debates with Self-Play Improves Judge Accuracy
by: Arnesen, Samuel, et al.
Published: (2024)
OG-RAG: Ontology-Grounded Retrieval-Augmented Generation For Large Language Models
by: Sharma, Kartik, et al.
Published: (2024)
Rewarding Creativity: A Human-Aligned Generative Reward Model for Reinforcement Learning in Storytelling
by: Li, Zhaoyan, et al.
Published: (2026)
Reducing Selection Bias in Large Language Models
by: Eicher, J. E., et al.
Published: (2024)
Revisiting Parameter-Based Knowledge Editing in Large Language Models: Theoretical Limits and Empirical Evidence
by: Ren, Wanying, et al.
Published: (2026)
In-Context Learning May Not Elicit Trustworthy Reasoning: A-Not-B Errors in Pretrained Language Models
by: Han, Pengrui, et al.
Published: (2024)
OpenAI Cribbed Our Tax Example, But Can GPT-4 Really Do Tax?
by: Blair-Stanek, Andrew, et al.
Published: (2023)
Syntactic Blind Spots: How Misalignment Leads to LLMs Mathematical Errors
by: Williamson, Dane, et al.
Published: (2025)
LLMs and the Human Condition
by: Wallis, Peter
Published: (2024)
Thinking Like a Student: AI-Supported Reflective Planning in a Theory-Intensive Computer Science Course
by: Izsak, Noa
Published: (2025)
TwinVoice: A Multi-dimensional Benchmark Towards Digital Twins via LLM Persona Simulation
by: Du, Bangde, et al.
Published: (2025)
The Information-Theoretic Imperative: Compression and the Epistemic Foundations of Intelligence
by: Dittrich, Christian, et al.
Published: (2025)
Graph Language Models
by: Plenz, Moritz, et al.
Published: (2024)
Beyond Recall: Behavioral Specification as an Interpretive Layer for AI Personalization
by: Gulaya, Aarik
Published: (2026)
Incentives or Ontology? A Structural Rebuttal to OpenAI's Hallucination Thesis
by: Ackermann, Richard, et al.
Published: (2025)
ConciseRL: Conciseness-Guided Reinforcement Learning for Efficient Reasoning Models
by: Dumitru, Razvan-Gabriel, et al.
Published: (2025)
NRR-Core: Non-Resolution Reasoning as a Computational Framework for Contextual Identity and Ambiguity Preservation
by: Saito, Kei
Published: (2025)
The Drill-Down and Fabricate Test (DDFT): A Protocol for Measuring Epistemic Robustness in Language Models
by: Baxi, Rahul
Published: (2025)
BabyReasoningBench: Generating Developmentally-Inspired Reasoning Tasks for Evaluating Baby Language Models
by: Dhole, Kaustubh D.
Published: (2026)
CAG: Chunked Augmented Generation for Google Chrome's Built-in Gemini Nano
by: Surulimuthu, Vivek Vellaiyappan, et al.
Published: (2024)
RF-Diffusion: Radio Signal Generation via Time-Frequency Diffusion
by: Chi, Guoxuan, et al.
Published: (2024)
Behavioural vs. Representational Systematicity in End-to-End Models: An Opinionated Survey
by: Vegner, Ivan, et al.
Published: (2025)
Box Maze: A Process-Control Architecture for Reliable LLM Reasoning
by: Qiang, Zou
Published: (2026)
Beyond Direct Generation: A Decomposed Approach to Well-Crafted Screenwriting with LLMs
by: Lei, Hang, et al.
Published: (2025)
QuickSilver -- Speeding up LLM Inference through Dynamic Token Halting, KV Skipping, Contextual Token Fusion, and Adaptive Matryoshka Quantization
by: Khanna, Danush, et al.
Published: (2025)