| Main Authors: | Bober-Irizar, Mikel, Banerjee, Soumya |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2402.03507 |
Similar Items
Skill Issues: An Analysis of CS:GO Skill Rating Systems
by: Bober-Irizar, Mikel, et al.
Published: (2024)
When can transformers reason with abstract symbols?
by: Boix-Adsera, Enric, et al.
Published: (2023)
Artificial Expert Intelligence through PAC-reasoning
by: Shalev-Shwartz, Shai, et al.
Published: (2024)
Is continuous CoT better suited for multi-lingual reasoning?
by: Bashir, Ali Hamza, et al.
Published: (2026)
Sudoku-Bench: Evaluating creative reasoning with Sudoku variants
by: Seely, Jeffrey, et al.
Published: (2025)
Are complicated loss functions necessary for teaching LLMs to reason?
by: Carrino, Gabriele, et al.
Published: (2026)
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning
by: Sprague, Zayne, et al.
Published: (2024)
Your thoughts tell who you are: Characterize the reasoning patterns of LRMs
by: Chen, Yida, et al.
Published: (2025)
Language models show human-like content effects on reasoning tasks
by: Dasgupta, Ishita, et al.
Published: (2022)
Neural machine translation of clinical procedure codes for medical diagnosis and uncertainty quantification
by: Chung, Pei-Hung, et al.
Published: (2024)
Critique of Impure Reason: Unveiling the reasoning behaviour of medical Large Language Models
by: Sim, Shamus, et al.
Published: (2024)
Encode, Think, Decode: Scaling test-time reasoning with recursive latent thoughts
by: Koishekenov, Yeskendir, et al.
Published: (2025)
A Statistical Framework for Data-dependent Retrieval-Augmented Models
by: Basu, Soumya, et al.
Published: (2024)
Counterfactual reasoning: an analysis of in-context emergence
by: Miller, Moritz, et al.
Published: (2025)
Explore Theory of Mind: Program-guided adversarial data generation for theory of mind reasoning
by: Sclar, Melanie, et al.
Published: (2024)
Multi-step retrieval and reasoning improves radiology question answering with large language models
by: Wind, Sebastian, et al.
Published: (2025)
LLMs cannot find reasoning errors, but can correct them given the error location
by: Tyen, Gladys, et al.
Published: (2023)
QuestBench: Can LLMs ask the right question to acquire information in reasoning tasks?
by: Li, Belinda Z., et al.
Published: (2025)
Catching rationalization in the act: detecting motivated reasoning before and after CoT via activation probing
by: Mirtaheri, Parsa, et al.
Published: (2026)
BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning
by: Zhang, Beichen, et al.
Published: (2025)
Emergent Misalignment: Narrow finetuning can produce broadly misaligned LLMs
by: Betley, Jan, et al.
Published: (2025)
Large Language Model Confidence Estimation via Black-Box Access
by: Pedapati, Tejaswini, et al.
Published: (2024)
BertaQA: How Much Do Language Models Know About Local Culture?
by: Etxaniz, Julen, et al.
Published: (2024)
Augmenting Lateral Thinking in Language Models with Humor and Riddle Data for the BRAINTEASER Task
by: Ghashami, Mina, et al.
Published: (2024)
Language hooks: a modular framework for augmenting LLM reasoning that decouples tool usage from the model and its prompt
by: de Mijolla, Damien, et al.
Published: (2024)
Enabling robots to follow abstract instructions and complete complex dynamic tasks
by: Mon-Williams, Ruaridh, et al.
Published: (2024)
Towards Efficient Neurally-Guided Program Induction for ARC-AGI
by: Ouellette, Simon
Published: (2024)
Deep learning and abstractive summarisation for radiological reports: an empirical study for adapting the PEGASUS models' family with scarce data
by: Benzoni, Claudio, et al.
Published: (2025)
HiTZ at VarDial 2025 NorSID: Overcoming Data Scarcity with Language Transfer and Automatic Data Annotation
by: Bengoetxea, Jaione, et al.
Published: (2024)
Towards Linguistic Neural Representation Learning and Sentence Retrieval from Electroencephalogram Recordings
by: Zhou, Jinzhao, et al.
Published: (2024)
ONNX-Net: Towards Universal Representations and Instant Performance Prediction for Neural Architectures
by: Qin, Shiwen, et al.
Published: (2025)
KITE: Kernelized and Information Theoretic Exemplars for In-Context Learning
by: Singh, Vaibhav, et al.
Published: (2025)
Entertainment chatbot for the digital inclusion of elderly people without abstraction capabilities
by: García-Méndez, Silvia, et al.
Published: (2024)
SLOT: Structuring the Output of Large Language Models
by: Wang, Darren Yow-Bang, et al.
Published: (2025)
Latxa: An Open Language Model and Evaluation Suite for Basque
by: Etxaniz, Julen, et al.
Published: (2024)
Evil twins are not that evil: Qualitative insights into machine-generated prompts
by: Rakotonirina, Nathanaël Carraz, et al.
Published: (2024)
PORTool: Importance-Aware Policy Optimization with Rewarded Tree for Multi-Tool-Integrated Reasoning
by: Wu, Feijie, et al.
Published: (2025)
Explainable machine learning multi-label classification of Spanish legal judgements
by: de Arriba-Pérez, Francisco, et al.
Published: (2024)
Latent Reasoning in TRMs is Secretly a Policy Improvement Operator
by: Asadulaev, Arip, et al.
Published: (2025)
Exposing propaganda: an analysis of stylistic cues comparing human annotations and machine classification
by: Faye, Géraud, et al.
Published: (2024)