:: Library Catalog

Bibliographic Details
Main Authors:	Sato, Takehiro; Ozaki, Shintaro; Yokoyama, Daisaku
Format:	Preprint
Published:	2024
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2409.01575

Similar Items

Strategy Adaptation in Large Language Model Werewolf Agents
by: Nakamori, Fuya, et al.
Published: (2025)

Verbal Werewolf: Engage Users with Verbalized Agentic Werewolf Game Framework
by: Fan, Qihui, et al.
Published: (2025)

Enhance Reasoning for Large Language Models in the Game Werewolf
by: Wu, Shuang, et al.
Published: (2024)

Exploring Large Language Models for Communication Games: An Empirical Study on Werewolf
by: Xu, Yuzhuang, et al.
Published: (2023)

Identifying Influential N-grams in Confidence Calibration via Regression Analysis
by: Ozaki, Shintaro, et al.
Published: (2026)

Relevant or Random: Can LLMs Truly Perform Analogical Reasoning?
by: Qin, Chengwei, et al.
Published: (2024)

WereWolf-Plus: An Update of Werewolf Game setting Based on DSGBench
by: Xia, Xinyuan, et al.
Published: (2025)

Enhancing Consistency of Werewolf AI through Dialogue Summarization and Persona Information
by: Tanaka, Yoshiki, et al.
Published: (2026)

Diagnosing Vision Language Models' Perception by Leveraging Human Methods for Color Vision Deficiencies
by: Hayashi, Kazuki, et al.
Published: (2025)

Enhancing Dialogue Generation in Werewolf Game Through Situation Analysis and Persuasion Strategies
by: Qi, Zhiyang, et al.
Published: (2024)

Werewolf Arena: A Case Study in LLM Evaluation via Social Deduction
by: Bailis, Suma, et al.
Published: (2024)

Helmsman of the Masses? Evaluate the Opinion Leadership of Large Language Models in the Werewolf Game
by: Du, Silin, et al.
Published: (2024)

BQA: Body Language Question Answering Dataset for Video Large Language Models
by: Ozaki, Shintaro, et al.
Published: (2024)

Towards Cross-Lingual Explanation of Artwork in Large-scale Vision Language Models
by: Ozaki, Shintaro, et al.
Published: (2024)

The LLM Effect: Are Humans Truly Using LLMs, or Are They Being Influenced By Them Instead?
by: Choi, Alexander S., et al.
Published: (2024)

Do LLMs Truly Benefit from Longer Context in Automatic Post-Editing?
by: Kim, Ahrii, et al.
Published: (2026)

Do LLMs Need to Think in One Language? Correlation between Latent Language and Task Performance
by: Ozaki, Shintaro, et al.
Published: (2025)

Demystifying Hybrid Thinking: Can LLMs Truly Switch Between Think and No-Think?
by: Wang, Shouren, et al.
Published: (2025)

Towards Trust Calibration in Socially Interactive Agents: Investigating Gendered Multimodal Behaviors Generation with LLMs
by: Galland, Lucie, et al.
Published: (2026)

Decoding AI Authorship: Can LLMs Truly Mimic Human Style Across Literature and Politics?
by: Alsadhan, Nasser A
Published: (2026)

Somatic in the East, Psychological in the West?: Investigating Clinically-Grounded Cross-Cultural Depression Symptom Expression in LLMs
by: Sakai, Shintaro, et al.
Published: (2025)

Can AI Truly Represent Your Voice in Deliberations? A Comprehensive Study of Large-Scale Opinion Aggregation with LLMs
by: Zhu, Shenzhe, et al.
Published: (2025)

Can LLMs Truly Embody Human Personality? Analyzing AI and Human Behavior Alignment in Dispute Resolution
by: Kwon, Deuksin, et al.
Published: (2026)

In Agents We Trust, but Who Do Agents Trust? Latent Source Preferences Steer LLM Generations
by: Khan, Mohammad Aflah, et al.
Published: (2026)

How does a Language-Specific Tokenizer affect LLMs?
by: Seo, Jean, et al.
Published: (2025)

AraTrust: An Evaluation of Trustworthiness for LLMs in Arabic
by: Alghamdi, Emad A., et al.
Published: (2024)

ParsTranslit: Truly Versatile Tajik-Farsi Transliteration
by: Merchant, Rayyan, et al.
Published: (2025)

Do LLMs Truly Understand When a Precedent Is Overruled?
by: Zhang, Li, et al.
Published: (2025)

Are LLMs Truly Multilingual? Exploring Zero-Shot Multilingual Capability of LLMs for Information Retrieval: An Italian Healthcare Use Case
by: Kembu, Vignesh Kumar, et al.
Published: (2025)

When to Trust LLMs: Aligning Confidence with Response Quality
by: Tao, Shuchang, et al.
Published: (2024)

Clinical knowledge in LLMs does not translate to human interactions
by: Bean, Andrew M., et al.
Published: (2025)

Do Large Language Models Truly Understand Geometric Structures?
by: Wang, Xiaofeng, et al.
Published: (2025)

Beyond Film Subtitles: Is YouTube the Best Approximation of Spoken Vocabulary?
by: Nohejl, Adam, et al.
Published: (2024)

Can We Trust LLMs? Mitigate Overconfidence Bias in LLMs through Knowledge Transfer
by: Yang, Haoyan, et al.
Published: (2024)

TextTIGER: Text-based Intelligent Generation with Entity Prompt Refinement for Text-to-Image Generation
by: Ozaki, Shintaro, et al.
Published: (2025)

Trust, Safety, and Accuracy: Assessing LLMs for Routine Maternity Advice
by: Divya, V Sai, et al.
Published: (2026)

Are Large Language Models Truly Smarter Than Humans?
by: M, Eshwar Reddy, et al.
Published: (2026)

Do Large Language Models Truly Understand Cross-cultural Differences?
by: Guo, Shiwei, et al.
Published: (2025)

Building Trust in Clinical LLMs: Bias Analysis and Dataset Transparency
by: Maslenkova, Svetlana, et al.
Published: (2025)

Can Large Language Models be Trusted for Evaluation? Scalable Meta-Evaluation of LLMs as Evaluators via Agent Debate
by: Chern, Steffi, et al.
Published: (2024)