| Main Authors: | Purpura, Alberto, Wang, Li, Badyal, Sahil, Beaufrand, Eugenio, Faulkner, Adam |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2601.03359 |
Similar Items
Deconstructing Instruction-Following: A New Benchmark for Granular Evaluation of Large Language Model Instruction Compliance Abilities
by: Purpura, Alberto, et al.
Published: (2026)
A Multi-Stage Workflow for the Review of Marketing Content with Reasoning Large Language Models
by: Purpura, Alberto, et al.
Published: (2025)
RIFT: Reordered Instruction Following Testbed To Evaluate Instruction Following in Singular Multistep Prompt Structures
by: Jaffe, Andrew, et al.
Published: (2026)
EVOREFUSE: Evolutionary Prompt Optimization for Evaluation and Mitigation of LLM Over-Refusal to Pseudo-Malicious Instructions
by: Wu, Xiaorui, et al.
Published: (2025)
AGENTIF: Benchmarking Instruction Following of Large Language Models in Agentic Scenarios
by: Qi, Yunjia, et al.
Published: (2025)
DIALEVAL: Automated Type-Theoretic Evaluation of LLM Instruction Following
by: Basta, Nardine, et al.
Published: (2026)
Self-Review Framework for Enhancing Instruction Following Capability of LLM
by: Park, Sihyun
Published: (2025)
Enhancing and Assessing Instruction-Following with Fine-Grained Instruction Variants
by: Yang, Jiuding, et al.
Published: (2024)
MUFFIN: Curating Multi-Faceted Instructions for Improving Instruction-Following
by: Lou, Renze, et al.
Published: (2023)
LIFEBench: Evaluating Length Instruction Following in Large Language Models
by: Zhang, Wei, et al.
Published: (2025)
Financial Instruction Following Evaluation (FIFE)
by: Matlin, Glenn, et al.
Published: (2025)
M-IFEval: Multilingual Instruction-Following Evaluation
by: Dussolle, Antoine, et al.
Published: (2025)
Instructional Prompt Optimization for Few-Shot LLM-Based Recommendations on Cold-Start Users
by: Yang, Haowei, et al.
Published: (2025)
PandaLM: An Automatic Evaluation Benchmark for LLM Instruction Tuning Optimization
by: Wang, Yidong, et al.
Published: (2023)
Adaptive Instruction Composition for Automated LLM Red-Teaming
by: Zymet, Jesse, et al.
Published: (2026)
RubricEval: A Rubric-Level Meta-Evaluation Benchmark for LLM Judges in Instruction Following
by: Pan, Tianjun, et al.
Published: (2026)
Boosting Instruction Following at Scale
by: Elder, Ben, et al.
Published: (2025)
The Instruction Gap: LLMs get lost in Following Instruction
by: Tripathi, Vishesh, et al.
Published: (2025)
LLM CHESS: Benchmarking Reasoning and Instruction-Following in LLMs through Chess
by: Kolasani, Sai, et al.
Published: (2025)
MaXIFE: Multilingual and Cross-lingual Instruction Following Evaluation
by: Liu, Yile, et al.
Published: (2025)
Multi-Level Compositional Reasoning for Interactive Instruction Following
by: Bhambri, Suvaansh, et al.
Published: (2023)
Instruction-Following Evaluation in Function Calling for Large Language Models
by: Skripko, Nikolai
Published: (2025)
ReIFE: Re-evaluating Instruction-Following Evaluation
by: Liu, Yixin, et al.
Published: (2024)
Embodied Instruction Following in Unknown Environments
by: Wu, Zhenyu, et al.
Published: (2024)
OctoBench: Benchmarking Scaffold-Aware Instruction Following in Repository-Grounded Agentic Coding
by: Ding, Deming, et al.
Published: (2026)
On the Multi-turn Instruction Following for Conversational Web Agents
by: Deng, Yang, et al.
Published: (2024)
Situated Instruction Following
by: Min, So Yeon, et al.
Published: (2024)
Neuro-Symbolic Verification on Instruction Following of LLMs
by: Su, Yiming, et al.
Published: (2026)
HREF: Human Response-Guided Evaluation of Instruction Following in Language Models
by: Lyu, Xinxi, et al.
Published: (2024)
Beyond Instruction Following: Evaluating Inferential Rule Following of Large Language Models
by: Sun, Wangtao, et al.
Published: (2024)
Procedural Knowledge Improves Agentic LLM Workflows
by: Hsiao, Vincent, et al.
Published: (2025)
LsrIF: Enhancing Logic-Structured Instruction Following of Large Language Models
by: Ren, Qingyu, et al.
Published: (2026)
RECAST: Expanding the Boundaries of LLMs' Complex Instruction Following with Multi-Constraint Data
by: Guo, Zhengkang, et al.
Published: (2025)
Evaluating Correctness and Faithfulness of Instruction-Following Models for Question Answering
by: Adlakha, Vaibhav, et al.
Published: (2023)
Prompt Codebooks: Discrete Compositional Optimization for Language Model Instruction Refinement
by: Nath, Jyotirmoy, et al.
Published: (2026)
Agentic Policy Optimization via Instruction-Policy Co-Evolution
by: Zhou, Han, et al.
Published: (2025)
VerIF: Verification Engineering for Reinforcement Learning in Instruction Following
by: Peng, Hao, et al.
Published: (2025)
Instruction Following by Principled Boosting Attention of Large Language Models
by: Guardieiro, Vitoria, et al.
Published: (2025)
LLM Based Bayesian Optimization for Prompt Search
by: Ballew, Adam, et al.
Published: (2025)
Scaling Reasoning, Losing Control: Evaluating Instruction Following in Large Reasoning Models
by: Fu, Tingchen, et al.
Published: (2025)