| Main Authors: | Paqaleh, Mohammad Mahdi Samiei, Jamalkhah, Mehdi, Baghshah, Mahdieh Soleymani |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2503.04940 |
Similar Items
Inductive Biases for Zero-shot Systematic Generalization in Language-informed Reinforcement Learning
by: Dijujin, Negin Hashemi, et al.
Published: (2025)
The Illusion of Procedural Reasoning: Measuring Long-Horizon FSM Execution in LLMs
by: Samiei, Mahdi, et al.
Published: (2025)
Bridging Reasoning to Learning: Unmasking Illusions using Complexity Out of Distribution Generalization
by: Paqaleh, Mohammad Mahdi Samiei, et al.
Published: (2025)
Language Plays a Pivotal Role in the Object-Attribute Compositional Generalization of CLIP
by: Abbasi, Reza, et al.
Published: (2024)
Ask in Any Modality: A Comprehensive Survey on Multimodal Retrieval-Augmented Generation
by: Abootorabi, Mohammad Mahdi, et al.
Published: (2025)
Khayyam Challenge (PersianMMLU): Is Your LLM Truly Wise to The Persian Language?
by: Ghahroodi, Omid, et al.
Published: (2024)
LibraGrad: Balancing Gradient Flow for Universally Better Vision Transformer Attributions
by: Mehri, Faridoun, et al.
Published: (2024)
SUSD: Structured Unsupervised Skill Discovery through State Factorization
by: Hosseini, Seyed Mohammad Hadi, et al.
Published: (2026)
Lying to Win: Assessing LLM Deception through Human-AI Games and Parallel-World Probing
by: Marioriyad, Arash, et al.
Published: (2026)
The Silent Judge: Unacknowledged Shortcut Bias in LLM-as-a-Judge
by: Marioriyad, Arash, et al.
Published: (2025)
LLM-Agent-Controller: A Universal Multi-Agent Large Language Model System as a Control Engineer
by: Zahedifar, Rasoul, et al.
Published: (2025)
Improving 3D Few-Shot Segmentation with Inference-Time Pseudo-Labeling
by: Mozafari, Mohammad, et al.
Published: (2024)
Efficient Adversarial Attacks on High-dimensional Offline Bandits
by: Hosseini, Seyed Mohammad Hadi, et al.
Published: (2026)
Understanding Counting Mechanisms in Large Language and Vision-Language Models
by: Hasani, Hosein, et al.
Published: (2025)
TextGames: Learning to Self-Play Text-Based Puzzle Games via Language Model Reasoning
by: Hudi, Frederikus, et al.
Published: (2025)
Visual Structures Helps Visual Reasoning: Addressing the Binding Problem in VLMs
by: Izadi, Amirmohammad, et al.
Published: (2025)
Playing Language Game with LLMs Leads to Jailbreaking
by: Peng, Yu, et al.
Published: (2024)
TextAtari: 100K Frames Game Playing with Language Agents
by: Li, Wenhao, et al.
Published: (2025)
GABInsight: Exploring Gender-Activity Binding Bias in Vision-Language Models
by: Abdollahi, Ali, et al.
Published: (2024)
Uncovering Grounding IDs: How External Cues Shape Multimodal Binding
by: Hasani, Hosein, et al.
Published: (2025)
Monitoring Emergent Reward Hacking During Generation via Internal Activations
by: Wilhelm, Patrick, et al.
Published: (2026)
Quantized Embedding Vectors for Controllable Diffusion Language Models
by: Kang, Cheng, et al.
Published: (2024)
The Judge Who Never Admits: Hidden Shortcuts in LLM-based Evaluation
by: Marioriyad, Arash, et al.
Published: (2026)
SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning
by: Liu, Bo, et al.
Published: (2025)
SPC: Evolving Self-Play Critic via Adversarial Games for LLM Reasoning
by: Chen, Jiaqi, et al.
Published: (2025)
Language Agents Mirror Human Causal Reasoning Biases. How Can We Help Them Think Like Scientists?
by: GX-Chen, Anthony, et al.
Published: (2025)
CAREL: Instruction-guided reinforcement learning with cross-modal auxiliary objectives
by: Saghafian, Armin, et al.
Published: (2024)
Large Language Models for Scientific Idea Generation: A Creativity-Centered Survey
by: Shahhosseini, Fatemeh, et al.
Published: (2025)
RPGBENCH: Evaluating Large Language Models as Role-Playing Game Engines
by: Yu, Pengfei, et al.
Published: (2025)
Gender Encoding Patterns in Pretrained Language Model Representations
by: Zakizadeh, Mahdi, et al.
Published: (2025)
Interpretable Emergent Language Using Inter-Agent Transformers
by: Bhardwaj, Mannan
Published: (2025)
Credence Calibration Game? Calibrating Large Language Models through Structured Play
by: Fang, Ke, et al.
Published: (2025)
Trained Models Tell Us How to Make Them Robust to Spurious Correlation without Group Annotation
by: Ghaznavi, Mahdi, et al.
Published: (2024)
From Emergence to Control: Probing and Modulating Self-Reflection in Language Models
by: Zhu, Xudong, et al.
Published: (2025)
Towards Understanding and Improving Refusal in Compressed Models via Mechanistic Interpretability
by: Chhabra, Vishnu Kabir, et al.
Published: (2025)
Mastering Board Games by External and Internal Planning with Language Models
by: Schultz, John, et al.
Published: (2024)
Emergent Languages in Populations of Language Model Agents: From Token Efficiency to Oversight Evasion
by: Beltoft, Stine Lyngsø, et al.
Published: (2026)
Your Agent May Misevolve: Emergent Risks in Self-evolving LLM Agents
by: Shao, Shuai, et al.
Published: (2025)
From Persona to Personalization: A Survey on Role-Playing Language Agents
by: Chen, Jiangjie, et al.
Published: (2024)
Unleashing the Emergent Cognitive Synergy in Large Language Models: A Task-Solving Agent through Multi-Persona Self-Collaboration
by: Wang, Zhenhailong, et al.
Published: (2023)