:: Library Catalog

Bibliographic Details
Main authors:	Ball, Thomas; Chen, Shuo; Herley, Cormac
Format:	Preprint
Published:	2024
Subjects:	Artificial Intelligence; Computation and Language; Machine Learning
Online access:	https://arxiv.org/abs/2409.07638

Similar Items

Even GPT-5.2 Can't Count to Five: The Case for Zero-Error Horizons in Trustworthy LLMs
by: Sato, Ryoma
Published: (2026)

A Logical Fallacy-Informed Framework for Argument Generation
by: Mouchel, Luca, et al.
Published: (2024)

Assessing the Impact of Prompting Methods on ChatGPT's Mathematical Capabilities
by: Chen, Yuhao, et al.
Published: (2023)

When Can Transformers Count to n?
by: Yehudai, Gilad, et al.
Published: (2024)

MAFALDA: A Benchmark and Comprehensive Study of Fallacy Detection and Classification
by: Helwe, Chadi, et al.
Published: (2023)

ModelGPT: Unleashing LLM's Capabilities for Tailored Model Generation
by: Tang, Zihao, et al.
Published: (2024)

MisSynth: Improving MISSCI Logical Fallacies Classification with Synthetic Data
by: Poliakov, Mykhailo, et al.
Published: (2025)

Can we trust the evaluation on ChatGPT?
by: Aiyappa, Rachith, et al.
Published: (2023)

Can GPT Redefine Medical Understanding? Evaluating GPT on Biomedical Machine Reading Comprehension
by: Vatsal, Shubham, et al.
Published: (2024)

Can LLMs Follow Simple Rules?
by: Mu, Norman, et al.
Published: (2023)

Can Post-Training Transform LLMs into Causal Reasoners?
by: Chen, Junqi, et al.
Published: (2026)

Quantifying the Capabilities of LLMs across Scale and Precision
by: Badshah, Sher, et al.
Published: (2024)

Non-Halting Queries: Exploiting Fixed Points in LLMs
by: Hammouri, Ghaith, et al.
Published: (2024)

Addressing the Ecological Fallacy in Larger LMs with Human Context
by: Soni, Nikita, et al.
Published: (2026)

How Much Can We Forget about Data Contamination?
by: Bordt, Sebastian, et al.
Published: (2024)

How Far Are We From AGI: Are LLMs All We Need?
by: Feng, Tao, et al.
Published: (2024)

HuatuoGPT-II, One-stage Training for Medical Adaption of LLMs
by: Chen, Junying, et al.
Published: (2023)

Evaluating GPT's Capability in Identifying Stages of Cognitive Impairment from Electronic Health Data
by: Leng, Yu, et al.
Published: (2025)

HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs
by: Chen, Junying, et al.
Published: (2024)

Counting Clues: A Lightweight Probabilistic Baseline Can Match an LLM
by: Jia, Furong, et al.
Published: (2025)

Can LLMs Speak For Diverse People? Tuning LLMs via Debate to Generate Controllable Controversial Statements
by: Li, Ming, et al.
Published: (2024)

We're Different, We're the Same: Creative Homogeneity Across LLMs
by: Wenger, Emily, et al.
Published: (2025)

Can GRPO Help LLMs Transcend Their Pretraining Origin?
by: Ni, Kangqi, et al.
Published: (2025)

Scaling In-Context Online Learning Capability of LLMs via Cross-Episode Meta-RL
by: Lin, Xiaofeng, et al.
Published: (2026)

Misclassification in Automated Content Analysis Causes Bias in Regression. Can We Fix It? Yes We Can!
by: TeBlunthuis, Nathan, et al.
Published: (2023)

Large Language Models Are Better Logical Fallacy Reasoners with Counterargument, Explanation, and Goal-Aware Prompt Formulation
by: Jeong, Jiwon, et al.
Published: (2025)

LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report
by: Zhao, Justin, et al.
Published: (2024)

How Numerical Precision Affects Arithmetical Reasoning Capabilities of LLMs
by: Feng, Guhao, et al.
Published: (2024)

Unlocking Reasoning Capabilities in LLMs via Reinforcement Learning Exploration
by: Deng, Wenhao, et al.
Published: (2025)

Enhancing Delta Compression in LLMs via SVD-based Quantization Error Minimization
by: Xiong, Boya, et al.
Published: (2025)

Beyond Answers: Transferring Reasoning Capabilities to Smaller LLMs Using Multi-Teacher Knowledge Distillation
by: Tian, Yijun, et al.
Published: (2024)

Can We Predict Before Executing Machine Learning Agents?
by: Zheng, Jingsheng, et al.
Published: (2026)

False Fixed Points: Kantian Feedback, Stable Miscalibration, and Representational Compression in LLMs
by: Okutomi, Akira
Published: (2025)

Counterfactual Evaluation Reveals Hidden Capability Profiles in Clinical LLMs and Agents
by: Turk, Matt
Published: (2026)

Single layer tiny Co$^4$ outpaces GPT-2 and GPT-BERT
by: Zain, Noor Ul, et al.
Published: (2025)

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
by: DeepSeek-AI, et al.
Published: (2025)

Aligning Tree-Search Policies with Fixed Token Budgets in Test-Time Scaling of LLMs
by: Miyamoto, Sora, et al.
Published: (2026)

Climbing the Ladder of Reasoning: What LLMs Can-and Still Can't-Solve after SFT?
by: Sun, Yiyou, et al.
Published: (2025)

RL-PLUS: Countering Capability Boundary Collapse of LLMs in Reinforcement Learning with Hybrid-policy Optimization
by: Dong, Yihong, et al.
Published: (2025)

BgGPT 1.0: Extending English-centric LLMs to other languages
by: Alexandrov, Anton, et al.
Published: (2024)