| Main Authors: | Oprea, David; Powers, Sam |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2509.07308 |
Similar Items
Assessing Bias in Metric Models for LLM Open-Ended Generation Bias Benchmarks
by: Demchak, Nathaniel, et al.
Published: (2024)
AutoLibra: Agent Metric Induction from Open-Ended Human Feedback
by: Zhu, Hao, et al.
Published: (2025)
Reverse-Engineered Reasoning for Open-Ended Generation
by: Wang, Haozhe, et al.
Published: (2025)
Open-Ended Wargames with Large Language Models
by: Hogan, Daniel P., et al.
Published: (2024)
Towards Open-Ended Discovery for Low-Resource NLP
by: Dossou, Bonaventure F. P., et al.
Published: (2025)
Generating Planning Feedback for Open-Ended Programming Exercises with LLMs
by: Demirtaş, Mehmet Arif, et al.
Published: (2025)
Balancing Diversity and Risk in LLM Sampling: How to Select Your Method and Parameter for Open-Ended Text Generation
by: Zhou, Yuxuan, et al.
Published: (2024)
In Search of the Ingredients of Open-Endedness: Replicating Picbreeder with Large Vision-Language Models
by: Earle, Sam, et al.
Published: (2026)
MIRROR: A Novel Approach for the Automated Evaluation of Open-Ended Question Generation
by: Deroy, Aniket, et al.
Published: (2024)
Generation Space Size: Understanding and Calibrating Open-Endedness of LLM Generations
by: Yu, Sunny, et al.
Published: (2025)
The Hyperfitting Phenomenon: Sharpening and Stabilizing LLMs for Open-Ended Text Generation
by: Carlsson, Fredrik, et al.
Published: (2024)
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery
by: Lu, Chris, et al.
Published: (2024)
O$^2$-Searcher: A Searching-based Agent Model for Open-Domain Open-Ended Question Answering
by: Mei, Jianbiao, et al.
Published: (2025)
MATEval: A Multi-Agent Discussion Framework for Advancing Open-Ended Text Evaluation
by: Li, Yu, et al.
Published: (2024)
StereoTales: A Multilingual Framework for Open-Ended Stereotype Discovery in LLMs
by: Jeune, Pierre Le, et al.
Published: (2026)
Dreaming in Code for Curriculum Learning in Open-Ended Worlds
by: Mitsides, Konstantinos, et al.
Published: (2026)
Transparent Reference-free Automated Evaluation of Open-Ended User Survey Responses
by: An, Subin, et al.
Published: (2025)
GuessingGame: Measuring the Informativeness of Open-Ended Questions in Large Language Models
by: Hutson, Dylan, et al.
Published: (2025)
Decoding Open-Ended Information Seeking Goals from Eye Movements in Reading
by: Hadar, Cfir Avraham, et al.
Published: (2025)
AgentCPM-Report: Interleaving Drafting and Deepening for Open-Ended Deep Research
by: Li, Yishan, et al.
Published: (2026)
OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
by: Xie, Tianbao, et al.
Published: (2024)
R2-Write: Reflection and Revision for Open-Ended Writing with Deep Reasoning
by: Liu, Wanlong, et al.
Published: (2026)
AHP-Powered LLM Reasoning for Multi-Criteria Evaluation of Open-Ended Responses
by: Lu, Xiaotian, et al.
Published: (2024)
ModeX: Evaluator-Free Best-of-N Selection for Open-Ended Generation
by: Choi, Hyeong Kyu, et al.
Published: (2026)
Preference learning in shades of gray: Interpretable and bias-aware reward modeling for human preferences
by: Oprea, Simona-Vasilica, et al.
Published: (2026)
A Mixture-of-Experts Approach to Few-Shot Task Transfer in Open-Ended Text Worlds
by: Cui, Christopher Z., et al.
Published: (2024)
AEL: Agent Evolving Learning for Open-Ended Environments
by: Xu, Wujiang, et al.
Published: (2026)
DIVERGE: Diversity-Enhanced RAG for Open-Ended Information Seeking
by: Hu, Tianyi, et al.
Published: (2026)
Rainbow Teaming: Open-Ended Generation of Diverse Adversarial Prompts
by: Samvelyan, Mikayel, et al.
Published: (2024)
PuzzleWorld: A Benchmark for Multimodal, Open-Ended Reasoning in Puzzlehunts
by: Li, Hengzhi, et al.
Published: (2025)
A Semantic-Sampling Framework for Evaluating Calibration in Open-Ended Question Answering
by: Wang, Zhanliang, et al.
Published: (2026)
From General to Targeted Rewards: Surpassing GPT-4 in Open-Ended Long-Context Generation
by: Guo, Zhihan, et al.
Published: (2025)
An Overview and Discussion on Using Large Language Models for Implementation Generation of Solutions to Open-Ended Problems
by: Shaik, Hashmath, et al.
Published: (2024)
ChinaTravel: An Open-Ended Travel Planning Benchmark with Compositional Constraint Validation for Language Agents
by: Shao, Jie-Jing, et al.
Published: (2024)
Self-Rewarding Rubric-Based Reinforcement Learning for Open-Ended Reasoning
by: Ye, Zhiling, et al.
Published: (2025)
O-Researcher: An Open Ended Deep Research Model via Multi-Agent Distillation and Agentic RL
by: Yao, Yi, et al.
Published: (2026)
Do LLMs Exhibit Human-Like Reasoning? Evaluating Theory of Mind in LLMs for Open-Ended Responses
by: Amirizaniani, Maryam, et al.
Published: (2024)
GTA-2: Benchmarking General Tool Agents from Atomic Tool-Use to Open-Ended Workflows
by: Wang, Jize, et al.
Published: (2026)
Tournament-GRPO: Group-Wise Tournament Rewards for Reinforcement Learning in Open-Ended Long-Form Generation
by: Yang, Zixuan, et al.
Published: (2026)
From National Curricula to Cultural Awareness: Constructing Open-Ended Culture-Specific Question Answering Dataset
by: Yoo, Haneul, et al.
Published: (2026)