APA Citation: Gu, Z., Zhang, L., Chen, J., Ye, H., Zhu, X., Li, Z., . . . Feng, H. (2023). Piecing Together Clues: A Benchmark for Evaluating the Detective Skills of Large Language Models.
Chicago Style (17th ed.) Citation: Gu, Zhouhong, et al. Piecing Together Clues: A Benchmark for Evaluating the Detective Skills of Large Language Models. 2023.
MLA (9th ed.) Citation: Gu, Zhouhong, et al. Piecing Together Clues: A Benchmark for Evaluating the Detective Skills of Large Language Models. 2023.