| Main Authors: | Tan, Weihao, Jiang, Changjiu, Duan, Yu, Lei, Mingcong, Li, Jiageng, Hong, Yitian, Wang, Xinrun, An, Bo |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2507.07445 |
Similar Items
CrafterDojo: A Suite of Foundation Models for Building Open-Ended Embodied Agents in Crafter
by: Park, Junyeong, et al.
Published: (2025)
JaxLife: An Open-Ended Agentic Simulator
by: Lu, Chris, et al.
Published: (2024)
Democratizing Game Modding with GenAI: A Case Study of StarCharM, a Stardew Valley Character Maker
by: Miralvand, Hamid Zand, et al.
Published: (2025)
True Knowledge Comes from Practice: Aligning LLMs with Embodied Environments via Reinforcement Learning
by: Tan, Weihao, et al.
Published: (2024)
OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
by: Xie, Tianbao, et al.
Published: (2024)
PuzzleWorld: A Benchmark for Multimodal, Open-Ended Reasoning in Puzzlehunts
by: Li, Hengzhi, et al.
Published: (2025)
KASER: Knowledge-Aligned Student Error Simulator for Open-Ended Coding Tasks
by: Duan, Zhangqi, et al.
Published: (2026)
MovieRecapsQA: A Multimodal Open-Ended Video Question-Answering Benchmark
by: Shaar, Shaden, et al.
Published: (2026)
CrafText Benchmark: Advancing Instruction Following in Complex Multimodal Open-Ended World
by: Volovikova, Zoya, et al.
Published: (2025)
Grading Open‐Ended Questions Using LLMs and RAG
by: Jacobo Farray Rodríguez, et al.
Published: (2025)
DeepPHY: Benchmarking Agentic VLMs on Physical Reasoning
by: Xu, Xinrun, et al.
Published: (2025)
OpenEP: Open-Ended Future Event Prediction
by: Guan, Yong, et al.
Published: (2024)
Open-Ended Video Game Glitch Detection with Agentic Reasoning and Temporal Grounding
by: Zheng, Muyang, et al.
Published: (2026)
How Intrinsic Motivation Underlies Embodied Open-Ended Behavior
by: Moreno-Bote, Rubén, et al.
Published: (2026)
Dojo: A Differentiable Physics Engine for Robotics
by: Howell, Taylor A., et al.
Published: (2022)
Can LLMs Reliably Simulate Human Learner Actions? A Simulation Authoring Framework for Open-Ended Learning Environments
by: Mannekote, Amogh, et al.
Published: (2024)
Agentic Learner with Grow-and-Refine Multimodal Semantic Memory
by: Bo, Weihao, et al.
Published: (2025)
Generating Planning Feedback for Open-Ended Programming Exercises with LLMs
by: Demirtaş, Mehmet Arif, et al.
Published: (2025)
Quality Control in Open-Ended Crowdsourcing: A Survey
by: Chai, Lei, et al.
Published: (2024)
On Creativity and Open-Endedness
by: Soros, L. B., et al.
Published: (2024)
R2-Write: Reflection and Revision for Open-Ended Writing with Deep Reasoning
by: Liu, Wanlong, et al.
Published: (2026)
LeakDojo: Decoding the Leakage Threats of RAG Systems
by: Zhang, Maosen, et al.
Published: (2026)
The Hyperfitting Phenomenon: Sharpening and Stabilizing LLMs for Open-Ended Text Generation
by: Carlsson, Fredrik, et al.
Published: (2024)
Craftax: A Lightning-Fast Benchmark for Open-Ended Reinforcement Learning
by: Matthews, Michael, et al.
Published: (2024)
MORQA: Benchmarking Evaluation Metrics for Medical Open-Ended Question Answering
by: Yim, Wen-wai, et al.
Published: (2025)
Too Open for Opinion? Embracing Open-Endedness in Large Language Models for Social Simulation
by: Ma, Bolei, et al.
Published: (2025)
MLE-Dojo: Interactive Environments for Empowering LLM Agents in Machine Learning Engineering
by: Qiang, Rushi, et al.
Published: (2025)
Enabling Adaptive Agent Training in Open-Ended Simulators by Targeting Diversity
by: Costales, Robby, et al.
Published: (2024)
A cross-modal pre-training framework with video data for improving performance and generalization of distributed acoustic sensing
by: Duan, Junyi, et al.
Published: (2025)
DAS-MAE: A self-supervised pre-training framework for universal and high-performance representation learning of distributed fiber-optic acoustic sensing
by: Duan, Junyi, et al.
Published: (2025)
ChinaTravel: An Open-Ended Travel Planning Benchmark with Compositional Constraint Validation for Language Agents
by: Shao, Jie-Jing, et al.
Published: (2024)
3DTTNet: Multimodal Fusion-Based 3D Traversable Terrain Modeling for Off-Road Environments
by: Chen, Zitong, et al.
Published: (2024)
PerfDojo: Automated ML Library Generation for Heterogeneous Architectures
by: Ivanov, Andrei, et al.
Published: (2025)
BeamDojo: Learning Agile Humanoid Locomotion on Sparse Footholds
by: Wang, Huayi, et al.
Published: (2025)
Training Language Model Agents to Find Vulnerabilities with CTF-Dojo
by: Zhuo, Terry Yue, et al.
Published: (2025)
CausalEvolve: Towards Open-Ended Discovery with Causal Scratchpad
by: Chen, Yongqiang, et al.
Published: (2026)
StereoTales: A Multilingual Framework for Open-Ended Stereotype Discovery in LLMs
by: Jeune, Pierre Le, et al.
Published: (2026)
Unveiling the Potential of Vision-Language-Action Models with Open-Ended Multimodal Instructions
by: Zhao, Wei, et al.
Published: (2025)
GUIDE: A Benchmark for Understanding and Assisting Users in Open-Ended GUI Tasks
by: Yang, Saelyne, et al.
Published: (2026)
Assessing Bias in Metric Models for LLM Open-Ended Generation Bias Benchmarks
by: Demchak, Nathaniel, et al.
Published: (2024)