| Main Author: | Szeider, Stefan |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2509.21224 |
Similar Items
LLM Self-Explanations Fail Semantic Invariance
by: Szeider, Stefan
Published: (2026)
CP-Agent: Agentic Constraint Programming
by: Szeider, Stefan
Published: (2025)
PBLean: Pseudo-Boolean Proof Certificates for Lean 4
by: Szeider, Stefan
Published: (2026)
Algorithm Selection with Zero Domain Knowledge via Text Embeddings
by: Szeider, Stefan
Published: (2026)
ASP-Bench: From Natural Language to Logic Programs
by: Szeider, Stefan
Published: (2026)
MCP-Solver: Integrating Language Models with Constraint Programming Systems
by: Szeider, Stefan
Published: (2024)
Extracting Problem Structure with LLMs for Optimized SAT Local Search
by: Schidler, André, et al.
Published: (2025)
Do Proactive Agents Really Need an LLM to Decide When to Wake and What to Anchor?
by: Liu, Xiaoze, et al.
Published: (2026)
Doing What They Say, Not What They Reason: Locating the Faithfulness Gap in LLM Agents
by: Wang, Yufeng
Published: (2026)
Do Agents Know What They Can't Do? Evaluating Feasibility Awareness in Tool-Using Agents
by: Cheng, Liang, et al.
Published: (2026)
When Do LLM Preferences Predict Downstream Behavior?
by: Slama, Katarina, et al.
Published: (2026)
Why Do Multi-Agent LLM Systems Fail?
by: Cemri, Mert, et al.
Published: (2025)
Explaining Decisions in ML Models: a Parameterized Complexity Analysis (Part I)
by: Ordyniak, Sebastian, et al.
Published: (2025)
Smart Cubing for Graph Search: A Comparative Study
by: Kirchweger, Markus, et al.
Published: (2025)
Do Cognitively Interpretable Reasoning Traces Improve LLM Performance?
by: Bhambri, Siddhant, et al.
Published: (2025)
Do Agent Societies Develop Intellectual Elites? The Hidden Power Laws of Collective Cognition in LLM Multi-Agent Systems
by: Venkatesh, Kavana, et al.
Published: (2026)
Learning What to Do and What Not To Do: Offline Imitation from Expert and Undesirable Demonstrations
by: Hoang, Huy, et al.
Published: (2025)
Do Role-Playing Agents Practice What They Preach? Belief-Behavior Consistency in LLM-Based Simulations of Human Trust
by: Mannekote, Amogh, et al.
Published: (2025)
What Do LLM Agents Know About Their World? Task2Quiz: A Paradigm for Studying Environment Understanding
by: Liu, Siyuan, et al.
Published: (2026)
What Can You Do When You Have Zero Rewards During RL?
by: Prakash, Jatin, et al.
Published: (2025)
Generating Streamlining Constraints with Large Language Models
by: Voboril, Florentina, et al.
Published: (2024)
Explaining Decisions in ML Models: a Parameterized Complexity Analysis
by: Ordyniak, Sebastian, et al.
Published: (2024)
Compilation and Fast Model Counting beyond CNF
by: de Colnet, Alexis, et al.
Published: (2025)
Streamliners for Answer Set Programming
by: Voboril, Florentina, et al.
Published: (2026)
What Would an LLM Do? Evaluating Large Language Models for Policymaking to Alleviate Homelessness
by: Coz, Pierre Le, et al.
Published: (2025)
When Agents Overtrust Environmental Evidence: An Extensible Agentic Framework for Benchmarking Evidence-Grounding Defects in LLM Agents
by: Sheng, Strick, et al.
Published: (2026)
When Agents Say One Thing and Do Another: Validating Elicited Beliefs from LLMs
by: Yamin, Khurram, et al.
Published: (2026)
When Robots Do the Chores: A Benchmark and Agent for Long-Horizon Household Task Execution
by: Zhu, Zilin, et al.
Published: (2026)
HiL-Bench (Human-in-Loop Benchmark): Do Agents Know When to Ask for Help?
by: Trinh, Tu, et al.
Published: (2026)
Do LLM Agents Exhibit Social Behavior?
by: Leng, Yan, et al.
Published: (2023)
Do Large Language Models Mentalize When They Teach?
by: Harootonian, Sevan K., et al.
Published: (2026)
When Do Multi-Agent Systems Outperform? Analysing the Learning Efficiency of Agentic Systems
by: Su, Junwei, et al.
Published: (2026)
What Is AI Safety? What Do We Want It to Be?
by: Harding, Jacqueline, et al.
Published: (2025)
What Do Learned Models Measure?
by: Žliobaitė, Indrė
Published: (2026)
Do LLMs Capture Embodied Cognition and Cultural Variation? Cross-Linguistic Evidence from Demonstratives
by: Wang, Yu, et al.
Published: (2026)
When Models Know When They Do Not Know: Calibration, Cascading, and Cleaning
by: Hao, Chenjie, et al.
Published: (2026)
FormGym: Doing Paperwork with Agents
by: Toles, Matthew, et al.
Published: (2025)
Applying Cognitive Design Patterns to General LLM Agents
by: Wray, Robert E., et al.
Published: (2025)
Do Code LLMs Understand Design Patterns?
by: Pan, Zhenyu, et al.
Published: (2025)
Agentic Neurosymbolic Collaboration for Mathematical Discovery: A Case Study in Combinatorial Design
by: Xia, Hai, et al.
Published: (2026)