| Main Authors: | Sato, Takehiro; Ozaki, Shintaro; Yokoyama, Daisaku |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2409.01575 |
Similar Items
Strategy Adaptation in Large Language Model Werewolf Agents
by: Nakamori, Fuya, et al.
Published: (2025)
Verbal Werewolf: Engage Users with Verbalized Agentic Werewolf Game Framework
by: Fan, Qihui, et al.
Published: (2025)
Enhance Reasoning for Large Language Models in the Game Werewolf
by: Wu, Shuang, et al.
Published: (2024)
Exploring Large Language Models for Communication Games: An Empirical Study on Werewolf
by: Xu, Yuzhuang, et al.
Published: (2023)
Identifying Influential N-grams in Confidence Calibration via Regression Analysis
by: Ozaki, Shintaro, et al.
Published: (2026)
Relevant or Random: Can LLMs Truly Perform Analogical Reasoning?
by: Qin, Chengwei, et al.
Published: (2024)
WereWolf-Plus: An Update of Werewolf Game setting Based on DSGBench
by: Xia, Xinyuan, et al.
Published: (2025)
Enhancing Consistency of Werewolf AI through Dialogue Summarization and Persona Information
by: Tanaka, Yoshiki, et al.
Published: (2026)
Diagnosing Vision Language Models' Perception by Leveraging Human Methods for Color Vision Deficiencies
by: Hayashi, Kazuki, et al.
Published: (2025)
Enhancing Dialogue Generation in Werewolf Game Through Situation Analysis and Persuasion Strategies
by: Qi, Zhiyang, et al.
Published: (2024)
Werewolf Arena: A Case Study in LLM Evaluation via Social Deduction
by: Bailis, Suma, et al.
Published: (2024)
Helmsman of the Masses? Evaluate the Opinion Leadership of Large Language Models in the Werewolf Game
by: Du, Silin, et al.
Published: (2024)
BQA: Body Language Question Answering Dataset for Video Large Language Models
by: Ozaki, Shintaro, et al.
Published: (2024)
Towards Cross-Lingual Explanation of Artwork in Large-scale Vision Language Models
by: Ozaki, Shintaro, et al.
Published: (2024)
The LLM Effect: Are Humans Truly Using LLMs, or Are They Being Influenced By Them Instead?
by: Choi, Alexander S., et al.
Published: (2024)
Do LLMs Truly Benefit from Longer Context in Automatic Post-Editing?
by: Kim, Ahrii, et al.
Published: (2026)
Do LLMs Need to Think in One Language? Correlation between Latent Language and Task Performance
by: Ozaki, Shintaro, et al.
Published: (2025)
Demystifying Hybrid Thinking: Can LLMs Truly Switch Between Think and No-Think?
by: Wang, Shouren, et al.
Published: (2025)
Towards Trust Calibration in Socially Interactive Agents: Investigating Gendered Multimodal Behaviors Generation with LLMs
by: Galland, Lucie, et al.
Published: (2026)
Decoding AI Authorship: Can LLMs Truly Mimic Human Style Across Literature and Politics?
by: Alsadhan, Nasser A
Published: (2026)
Somatic in the East, Psychological in the West?: Investigating Clinically-Grounded Cross-Cultural Depression Symptom Expression in LLMs
by: Sakai, Shintaro, et al.
Published: (2025)
Can AI Truly Represent Your Voice in Deliberations? A Comprehensive Study of Large-Scale Opinion Aggregation with LLMs
by: Zhu, Shenzhe, et al.
Published: (2025)
Can LLMs Truly Embody Human Personality? Analyzing AI and Human Behavior Alignment in Dispute Resolution
by: Kwon, Deuksin, et al.
Published: (2026)
In Agents We Trust, but Who Do Agents Trust? Latent Source Preferences Steer LLM Generations
by: Khan, Mohammad Aflah, et al.
Published: (2026)
How does a Language-Specific Tokenizer affect LLMs?
by: Seo, Jean, et al.
Published: (2025)
AraTrust: An Evaluation of Trustworthiness for LLMs in Arabic
by: Alghamdi, Emad A., et al.
Published: (2024)
ParsTranslit: Truly Versatile Tajik-Farsi Transliteration
by: Merchant, Rayyan, et al.
Published: (2025)
Do LLMs Truly Understand When a Precedent Is Overruled?
by: Zhang, Li, et al.
Published: (2025)
Are LLMs Truly Multilingual? Exploring Zero-Shot Multilingual Capability of LLMs for Information Retrieval: An Italian Healthcare Use Case
by: Kembu, Vignesh Kumar, et al.
Published: (2025)
When to Trust LLMs: Aligning Confidence with Response Quality
by: Tao, Shuchang, et al.
Published: (2024)
Clinical knowledge in LLMs does not translate to human interactions
by: Bean, Andrew M., et al.
Published: (2025)
Do Large Language Models Truly Understand Geometric Structures?
by: Wang, Xiaofeng, et al.
Published: (2025)
Beyond Film Subtitles: Is YouTube the Best Approximation of Spoken Vocabulary?
by: Nohejl, Adam, et al.
Published: (2024)
Can We Trust LLMs? Mitigate Overconfidence Bias in LLMs through Knowledge Transfer
by: Yang, Haoyan, et al.
Published: (2024)
TextTIGER: Text-based Intelligent Generation with Entity Prompt Refinement for Text-to-Image Generation
by: Ozaki, Shintaro, et al.
Published: (2025)
Trust, Safety, and Accuracy: Assessing LLMs for Routine Maternity Advice
by: Divya, V Sai, et al.
Published: (2026)
Are Large Language Models Truly Smarter Than Humans?
by: M, Eshwar Reddy, et al.
Published: (2026)
Do Large Language Models Truly Understand Cross-cultural Differences?
by: Guo, Shiwei, et al.
Published: (2025)
Building Trust in Clinical LLMs: Bias Analysis and Dataset Transparency
by: Maslenkova, Svetlana, et al.
Published: (2025)
Can Large Language Models be Trusted for Evaluation? Scalable Meta-Evaluation of LLMs as Evaluators via Agent Debate
by: Chern, Steffi, et al.
Published: (2024)