Library Catalog

Bibliographic Details
Main Authors:	Ma, Xinbei; Ma, Ruotian; Chen, Xingyu; Shi, Zhengliang; Wang, Mengru; Huang, Jen-tse; Yang, Qu; Wang, Wenxuan; Ye, Fanghua; Jiang, Qingxuan; Zhou, Mengfei; Zhang, Zhuosheng; Wang, Rui; Zhao, Hai; Tu, Zhaopeng; Li, Xiaolong; Linus
Format:	Preprint
Published:	2025
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2509.26126

Similar Items

Social Welfare Function Leaderboard: When LLM Agents Allocate Social Welfare
by: Shi, Zhengliang, et al.
Published: (2025)

Too Good to be Bad: On the Failure of LLMs to Role-Play Villains
by: Yi, Zihao, et al.
Published: (2025)

BatonVoice: An Operationalist Framework for Enhancing Controllable Speech Synthesis with Linguistic Intelligence from LLMs
by: Wang, Yue, et al.
Published: (2025)

Sentient Agent as a Judge: Evaluating Higher-Order Social Cognition in Large Language Models
by: Zhang, Bang, et al.
Published: (2025)

Insight Over Sight: Exploring the Vision-Knowledge Conflicts in Multimodal LLMs
by: Liu, Xiaoyuan, et al.
Published: (2024)

CoCo-Agent: A Comprehensive Cognitive MLLM Agent for Smartphone GUI Automation
by: Ma, Xinbei, et al.
Published: (2024)

On the Shortcut Learning in Multilingual Neural Machine Translation
by: Wang, Wenxuan, et al.
Published: (2024)

RLVER: Reinforcement Learning with Verifiable Emotion Rewards for Empathetic Agents
by: Wang, Peisong, et al.
Published: (2025)

Think Fast and Slow: Step-Level Cognitive Depth Adaptation for LLM Agents
by: Yang, Ruihan, et al.
Published: (2026)

Two Experts Are All You Need for Steering Thinking: Reinforcing Cognitive Effort in MoE Reasoning Models Without Additional Training
by: Wang, Mengru, et al.
Published: (2025)

SPC: Evolving Self-Play Critic via Adversarial Games for LLM Reasoning
by: Chen, Jiaqi, et al.
Published: (2025)

GPT-4 Is Too Smart To Be Safe: Stealthy Chat with LLMs via Cipher
by: Yuan, Youliang, et al.
Published: (2023)

Chain-of-Jailbreak Attack for Image Generation Models via Editing Step by Step
by: Wang, Wenxuan, et al.
Published: (2024)

Can't See the Forest for the Trees: Benchmarking Multimodal Safety Awareness for Multimodal LLMs
by: Wang, Wenxuan, et al.
Published: (2025)

Not All Countries Celebrate Thanksgiving: On the Cultural Dominance in Large Language Models
by: Wang, Wenxuan, et al.
Published: (2023)

All Languages Matter: On the Multilingual Safety of Large Language Models
by: Wang, Wenxuan, et al.
Published: (2023)

MEGen: Generative Backdoor into Large Language Models via Model Editing
by: Qiu, Jiyang, et al.
Published: (2024)

Plan-over-Graph: Towards Parallelable LLM Agent Schedule
by: Zhang, Shiqi, et al.
Published: (2025)

Chain-of-Trigger: An Agentic Backdoor that Paradoxically Enhances Agentic Robustness
by: Qiu, Jiyang, et al.
Published: (2025)

Refuse Whenever You Feel Unsafe: Improving Safety in LLMs via Decoupled Refusal Training
by: Yuan, Youliang, et al.
Published: (2024)

How Far Are We on the Decision-Making of LLMs? Evaluating LLMs' Gaming Ability in Multi-Agent Environments
by: Huang, Jen-tse, et al.
Published: (2024)

On the Robustness of Editing Large Language Models
by: Ma, Xinbei, et al.
Published: (2024)

Caution for the Environment: Multimodal LLM Agents are Susceptible to Environmental Distractions
by: Ma, Xinbei, et al.
Published: (2024)

On the Failure of Latent State Persistence in Large Language Models
by: Huang, Jen-tse, et al.
Published: (2025)

Agent-Dice: Disentangling Knowledge Updates via Geometric Consensus for Agent Continual Learning
by: Wu, Zheng, et al.
Published: (2026)

Emotionally Numb or Empathetic? Evaluating How LLMs Feel Using EmotionBench
by: Huang, Jen-tse, et al.
Published: (2023)

CogDual: Enhancing Dual Cognition of LLMs via Reinforcement Learning with Implicit Rule-Based Rewards
by: Liu, Cheng, et al.
Published: (2025)

How Deep is Love in LLMs' Hearts? Exploring Semantic Size in Human-like Cognition
by: Yao, Yao, et al.
Published: (2025)

Who is ChatGPT? Benchmarking LLMs' Psychological Portrayal Using PsychoBench
by: Huang, Jen-tse, et al.
Published: (2023)

Houston Food Bank's Hunger Game Thrives on Competition
Published: (2024)

Identifying the Achilles' Heel: An Iterative Method for Dynamically Uncovering Factual Errors in Large Language Models
by: Wang, Wenxuan, et al.
Published: (2024)

Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs
by: Wang, Yue, et al.
Published: (2025)

ComboBench: Can LLMs Manipulate Physical Devices to Play Virtual Reality Games?
by: Li, Shuqing, et al.
Published: (2025)

RaSA: Rank-Sharing Low-Rank Adaptation
by: He, Zhiwei, et al.
Published: (2025)

Improving Machine Translation with Human Feedback: An Exploration of Quality Estimation as a Reward Model
by: He, Zhiwei, et al.
Published: (2024)

VisBias: Measuring Explicit and Implicit Social Biases in Vision Language Models
by: Huang, Jen-tse, et al.
Published: (2025)

AI Sees Your Location, But With A Bias Toward The Wealthy World
by: Huang, Jingyuan, et al.
Published: (2025)

The Lighthouse of Language: Enhancing LLM Agents via Critique-Guided Improvement
by: Yang, Ruihan, et al.
Published: (2025)

Understanding and Mitigating the Uncertainty in Zero-Shot Translation
by: Wang, Wenxuan, et al.
Published: (2022)

Plan-MCTS: Plan Exploration for Action Exploitation in Web Navigation
by: Zhang, Weiming, et al.
Published: (2026)