Huang, J., Cheng, F., Jiang, J., Yu, Z., & Aizawa, A. (2026). BenchTrace: A Benchmark for Testing Reflection Ability and Controlled Evolution in LLM Agents.
Chicago Style (17th ed.) Citation: Huang, Jiahao, Fei Cheng, Junfeng Jiang, Zefan Yu, and Akiko Aizawa. BenchTrace: A Benchmark for Testing Reflection Ability and Controlled Evolution in LLM Agents. 2026.
MLA (9th ed.) Citation: Huang, Jiahao, et al. BenchTrace: A Benchmark for Testing Reflection Ability and Controlled Evolution in LLM Agents. 2026.