Library Catalog

Bibliographic Details
Main Authors:	Bhar, Swarnadeep; Naim, Omar; Metheniti, Eleni; Navarri, Bastien; Cabannes, Loïc; Ezzabady, Morteza; Asher, Nicholas
Format:	Preprint
Published:	2025
Subjects:	Computation and Language; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2509.04470

Similar Items

Strong hallucinations from negation and how to fix them
by: Asher, Nicholas, et al.
Published: (2024)

SSA: Improving Performance With a Better Scoring Function
by: Naim, Omar, et al.
Published: (2025)

On Explaining with Attention Matrices
by: Naim, Omar, et al.
Published: (2024)

Analyzing limits for in-context learning
by: Naim, Omar, et al.
Published: (2025)

Revisiting the Reliability of Language Models in Instruction-Following
by: Dong, Jianshuo, et al.
Published: (2025)

Research Trends for the Interplay between Large Language Models and Knowledge Graphs
by: Khorashadizadeh, Hanieh, et al.
Published: (2024)

Sparse Activation Editing for Reliable Instruction Following in Narratives
by: Zhao, Runcong, et al.
Published: (2025)

Exploiting Instruction-Following Retrievers for Malicious Information Retrieval
by: BehnamGhader, Parishad, et al.
Published: (2025)

Reliable Extraction of Clinical Follow-Up Instructions: A Hybrid Neural-Symbolic Pipeline
by: Laufer, Michal, et al.
Published: (2026)

Re-examining learning linear functions in context
by: Naim, Omar, et al.
Published: (2024)

Evaluating Correctness and Faithfulness of Instruction-Following Models for Question Answering
by: Adlakha, Vaibhav, et al.
Published: (2023)

The Instruction Gap: LLMs get lost in Following Instruction
by: Tripathi, Vishesh, et al.
Published: (2025)

LLM CHESS: Benchmarking Reasoning and Instruction-Following in LLMs through Chess
by: Kolasani, Sai, et al.
Published: (2025)

MUFFIN: Curating Multi-Faceted Instructions for Improving Instruction-Following
by: Lou, Renze, et al.
Published: (2023)

Learning to Plan & Reason for Evaluation with Thinking-LLM-as-a-Judge
by: Saha, Swarnadeep, et al.
Published: (2025)

ReConcile: Round-Table Conference Improves Reasoning via Consensus among Diverse LLMs
by: Chen, Justin Chih-Yao, et al.
Published: (2023)

Training with Pseudo-Code for Instruction Following
by: Kumar, Prince, et al.
Published: (2025)

WildIFEval: Instruction Following in the Wild
by: Lior, Gili, et al.
Published: (2025)

Beyond Instruction Following: Evaluating Inferential Rule Following of Large Language Models
by: Sun, Wangtao, et al.
Published: (2024)

M-IFEval: Multilingual Instruction-Following Evaluation
by: Dussolle, Antoine, et al.
Published: (2025)

UltraIF: Advancing Instruction Following from the Wild
by: An, Kaikai, et al.
Published: (2025)

Instructions are all you need: Self-supervised Reinforcement Learning for Instruction Following
by: Ren, Qingyu, et al.
Published: (2025)

Better Instruction-Following Through Minimum Bayes Risk
by: Wu, Ian, et al.
Published: (2024)

On the Multi-turn Instruction Following for Conversational Web Agents
by: Deng, Yang, et al.
Published: (2024)

Benchmarking Complex Instruction-Following with Multiple Constraints Composition
by: Wen, Bosi, et al.
Published: (2024)

VerIF: Verification Engineering for Reinforcement Learning in Instruction Following
by: Peng, Hao, et al.
Published: (2025)

Thinking LLMs: General Instruction Following with Thought Generation
by: Wu, Tianhao, et al.
Published: (2024)

ContextCov: Deriving and Enforcing Executable Constraints from Agent Instruction Files
by: Sharma, Reshabh K
Published: (2026)

Enhancing and Assessing Instruction-Following with Fine-Grained Instruction Variants
by: Yang, Jiuding, et al.
Published: (2024)

Financial Instruction Following Evaluation (FIFE)
by: Matlin, Glenn, et al.
Published: (2025)

How Reliable Are Automatic Evaluation Methods for Instruction-Tuned LLMs?
by: Doostmohammadi, Ehsan, et al.
Published: (2024)

Self-Review Framework for Enhancing Instruction Following Capability of LLM
by: Park, Sihyun
Published: (2025)

MaXIFE: Multilingual and Cross-lingual Instruction Following Evaluation
by: Liu, Yile, et al.
Published: (2025)

LIFEBench: Evaluating Length Instruction Following in Large Language Models
by: Zhang, Wei, et al.
Published: (2025)

DIALEVAL: Automated Type-Theoretic Evaluation of LLM Instruction Following
by: Basta, Nardine, et al.
Published: (2026)

Can Language Models Follow Multiple Turns of Entangled Instructions?
by: Han, Chi, et al.
Published: (2025)

Evolving and Executing Research Plans via Double-Loop Multi-Agent Collaboration
by: Zhang, Zhi, et al.
Published: (2025)

From Meta-Thought to Execution: Cognitively Aligned Post-Training for Generalizable and Reliable LLM Reasoning
by: Wang, Shaojie, et al.
Published: (2026)

Iteration Head: A Mechanistic Study of Chain-of-Thought
by: Cabannes, Vivien, et al.
Published: (2024)

Is In-Context Learning Sufficient for Instruction Following in LLMs?
by: Zhao, Hao, et al.
Published: (2024)