| Main Authors: | Wu, Yuning, Mei, Jiahao, Yan, Ming, Li, Chenliang, Lai, Shaopeng, Ren, Yuran, Wang, Zijia, Zhang, Ji, Wu, Mengyue, Jin, Qin, Huang, Fei |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2503.05244 |
Similar Items
MM-StoryAgent: Immersive Narrated Storybook Video Generation with a Multi-Agent Paradigm across Text, Image and Audio
by: Xu, Xuenan, et al.
Published: (2025)
R2-Write: Reflection and Revision for Open-Ended Writing with Deep Reasoning
by: Liu, Wanlong, et al.
Published: (2026)
Writing-RL: Advancing Long-form Writing via Adaptive Curriculum Reinforcement Learning
by: Lei, Xuanyu, et al.
Published: (2025)
QwenLong-CPRS: Towards $\infty$-LLMs with Dynamic Context Optimization
by: Shen, Weizhou, et al.
Published: (2025)
LitBench: A Benchmark and Dataset for Reliable Evaluation of Creative Writing
by: Fein, Daniel, et al.
Published: (2025)
HoWToBench: Holistic Evaluation for LLM's Capability in Human-level Writing using Tree of Writing
by: Feng, Andrew Zhuoer, et al.
Published: (2026)
From Coarse to Fine: Benchmarking and Reward Modeling for Writing-Centric Generation Tasks
by: Ren, Qingyu, et al.
Published: (2026)
CorpusQA: A 10 Million Token Benchmark for Corpus-Level Analysis and Reasoning
by: Lu, Zhiyuan, et al.
Published: (2026)
SocialBench: Sociality Evaluation of Role-Playing Conversational Agents
by: Chen, Hongzhan, et al.
Published: (2024)
SurveyBench: Can LLM(-Agents) Write Academic Surveys that Align with Reader Needs?
by: Sun, Zhaojun, et al.
Published: (2025)
EssayBench: Evaluating Large Language Models in Multi-Genre Chinese Essay Writing
by: Gao, Fan, et al.
Published: (2025)
Persona-Augmented Benchmarking: Evaluating LLMs Across Diverse Writing Styles
by: Truong, Kimberly Le, et al.
Published: (2025)
Writer-R1: Enhancing Generative Writing in LLMs via Memory-augmented Replay Policy Optimization
by: Zhao, Jihao, et al.
Published: (2026)
CreativeBench: Benchmarking and Enhancing Machine Creativity via Self-Evolving Challenges
by: Wang, Zi-Han, et al.
Published: (2026)
Writing Your Heritage: A Sequence of Thinking, Reading, and Writing Assignments. Writing Teachers at Work.
by: Dixon, Deborah
Published: (1993)
Let's Use ChatGPT To Write Our Paper! Benchmarking LLMs To Write the Introduction of a Research Paper
by: Garg, Krishna, et al.
Published: (2025)
OphthBench: A Comprehensive Benchmark for Evaluating Large Language Models in Chinese Ophthalmology
by: Zhou, Chengfeng, et al.
Published: (2025)
Help Me Write a Story: Evaluating LLMs' Ability to Generate Writing Feedback
by: Rashkin, Hannah, et al.
Published: (2025)
JMedBench: A Benchmark for Evaluating Japanese Biomedical Large Language Models
by: Jiang, Junfeng, et al.
Published: (2024)
QwenLong-L1.5: Post-Training Recipe for Long-Context Reasoning and Memory Management
by: Shen, Weizhou, et al.
Published: (2025)
A Human-Centric Pipeline for Aligning Large Language Models with Chinese Medical Ethics
by: Jin, Haoan, et al.
Published: (2026)
Meow: End-to-End Outline Writing for Automatic Academic Survey
by: Ma, Zhaoyu, et al.
Published: (2025)
Philosophy of Writing
by: Arndt, David
Published: (2026)
LongGenBench: Benchmarking Long-Form Generation in Long Context LLMs
by: Wu, Yuhao, et al.
Published: (2024)
OmniEduBench: A Comprehensive Chinese Benchmark for Evaluating Large Language Models in Education
by: Zhang, Min, et al.
Published: (2025)
CC-GSEO-Bench: A Content-Centric Benchmark for Measuring Source Influence in Generative Search Engines
by: Chen, Qiyuan, et al.
Published: (2025)
Synthetic or Authentic? Building Mental Patient Simulators from Longitudinal Evidence
by: Li, Baihan, et al.
Published: (2026)
SpeechParaling-Bench: A Comprehensive Benchmark for Paralinguistic-Aware Speech Generation
by: Liu, Ruohan, et al.
Published: (2026)
IssueBench: Millions of Realistic Prompts for Measuring Issue Bias in LLM Writing Assistance
by: Röttger, Paul, et al.
Published: (2025)
FormalProofBench: Can Models Write Graduate Level Math Proofs That Are Formally Verified?
by: Ravi, Nikil, et al.
Published: (2026)
Neural Automated Writing Evaluation with Corrective Feedback
by: Wang, Izia Xiaoxiao, et al.
Published: (2024)
A Simple yet Effective Training-free Prompt-free Approach to Chinese Spelling Correction Based on Large Language Models
by: Zhou, Houquan, et al.
Published: (2024)
Learning to Generate Text in Arbitrary Writing Styles
by: Khan, Aleem, et al.
Published: (2023)
Temporal Flattening in LLM-Generated Text: Comparing Human and LLM Writing Trajectories
by: Cao, Zhanwei, et al.
Published: (2026)
Small LLMs Are Weak Tool Learners: A Multi-LLM Agent
by: Shen, Weizhou, et al.
Published: (2024)
Step-Back Profiling: Distilling User History for Personalized Scientific Writing
by: Tang, Xiangru, et al.
Published: (2024)
Writing Development in Struggling Learners
Published: (2022)
The Use of Basic Writing Materials in ESL Writing Classes.
by: England, Lizabeth
Published: (1984)
DECOR: Improving Coherence in L2 English Writing with a Novel Benchmark for Incoherence Detection, Reasoning, and Rewriting
by: Zhang, Xuanming, et al.
Published: (2024)
BenchBench: Benchmarking Automated Benchmark Generation
by: Zheng, Yandan, et al.
Published: (2026)