:: Library Catalog

Bibliographic Details
Main Authors:	Bondarenko, Alexander; Volk, Denis; Volkov, Dmitrii; Ladish, Jeffrey
Format:	Preprint
Published:	2025
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2502.13295

Similar Items

Hacking CTFs with Plain Agents
by: Turtayev, Rustem, et al.
Published: (2024)

Incomplete Tasks Induce Shutdown Resistance in Some Frontier LLMs
by: Schlatter, Jeremy, et al.
Published: (2025)

LoRA Fine-tuning Efficiently Undoes Safety Training in Llama 2-Chat 70B
by: Lermen, Simon, et al.
Published: (2023)

LLM Agent Honeypot: Monitoring AI Hacking Agents in the Wild
by: Reworr, et al.
Published: (2024)

Badllama 3: removing safety finetuning from Llama 3 in minutes
by: Volkov, Dmitrii
Published: (2024)

Evaluating AI cyber capabilities with crowdsourced elicitation
by: Petrov, Artem, et al.
Published: (2025)

Language Models Can Autonomously Hack and Self-Replicate
by: Air, Alena, et al.
Published: (2026)

LLM Robustness Against Misinformation in Biomedical Question Answering
by: Bondarenko, Alexander, et al.
Published: (2024)

Social preferences with unstable interactive reasoning: Large language models in economic trust games
by: Jiamin, Ou, et al.
Published: (2025)

Internal states before wait modulate reasoning patterns
by: Troitskii, Dmitrii, et al.
Published: (2025)

Misalignment Bounty: Crowdsourcing AI Agent Misbehavior
by: Turtayev, Rustem, et al.
Published: (2025)

Is Value Functions Estimation with Classification Plug-and-play for Offline Reinforcement Learning?
by: Tarasov, Denis, et al.
Published: (2024)

Advancing site-specific disease and pest management in precision agriculture: From reasoning-driven foundation models to adaptive, feedback-based learning
by: Rai, Nitin, et al.
Published: (2025)

People use fast, goal-directed simulation to reason about novel games
by: Zhang, Cedegao E., et al.
Published: (2024)

Reasonably reasoning AI agents can avoid game-theoretic failures in zero-shot, provably
by: Kang, Enoch Hyunwook
Published: (2026)

Sudoku-Bench: Evaluating creative reasoning with Sudoku variants
by: Seely, Jeffrey, et al.
Published: (2025)

Unsupervised decoding of encoded reasoning using language model interpretability
by: Fang, Ching, et al.
Published: (2025)

Slm-mux: Orchestrating small language models for reasoning
by: Wang, Chenyu, et al.
Published: (2025)

What properties of reasoning supervision are associated with improved downstream model quality?
by: Langner, Mikołaj, et al.
Published: (2026)

Response: Emergent analogical reasoning in large language models
by: Hodel, Damian, et al.
Published: (2023)

Do explanations generalize across large reasoning models?
by: Pal, Koyena, et al.
Published: (2026)

Retrieval-augmented reasoning with lean language models
by: Chan, Ryan Sze-Yin, et al.
Published: (2025)

MathDivide: Improved mathematical reasoning by large language models
by: Srivastava, Saksham Sahai, et al.
Published: (2024)

Code-enabled language models can outperform reasoning models on diverse tasks
by: Zhang, Cedegao E., et al.
Published: (2025)

Mathematical reasoning and the computer
by: Buzzard, Kevin
Published: (2025)

Causal reasoning in difference graphs
by: Assaad, Charles K.
Published: (2024)

Mellow: a small audio language model for reasoning
by: Deshmukh, Soham, et al.
Published: (2025)

Aggregating Low Rank Adapters in Federated Fine-tuning
by: Trautmann, Evelyn, et al.
Published: (2025)

ZNO-Eval: Benchmarking reasoning capabilities of large language models in Ukrainian
by: Syromiatnikov, Mykyta, et al.
Published: (2025)

Replacing thinking with tool usage enables reasoning in small language models
by: Rainone, Corrado, et al.
Published: (2025)

Enhancing reasoning accuracy in large language models during inference time
by: Sharma, Vinay, et al.
Published: (2026)

LLM world models are mental: Output layer evidence of brittle world model use in LLM mechanical reasoning
by: Robertson, Cole, et al.
Published: (2025)

From Single Agent to Multi-Agent: Improving Traffic Signal Control
by: Tislenko, Maksim, et al.
Published: (2024)

Harnessing the power of LLMs for normative reasoning in MASs
by: Savarimuthu, Bastin Tony Roy, et al.
Published: (2024)

CoT-Self-Instruct: Building high-quality synthetic prompts for reasoning and non-reasoning tasks
by: Yu, Ping, et al.
Published: (2025)

Building evidence-based knowledge bases from full-text literature for disease-specific biomedical reasoning
by: Zong, Chang, et al.
Published: (2026)

Teaching large language models to reason like expert diagnosticians
by: Buckley, Thomas A., et al.
Published: (2025)

Reinforcement learning fine-tuning of language model for instruction following and math reasoning
by: Han, Yifu, et al.
Published: (2025)

Large language models show fragile cognitive reasoning about human emotions
by: Bhattacharyya, Sree, et al.
Published: (2025)

Efficient and Private: Memorisation under differentially private parameter-efficient fine-tuning in language models
by: Ma, Olivia, et al.
Published: (2024)