| Main Authors: | Li, Haoxuan, Ma, Mingyu Derek, Huang, Jen-tse, Weng, Zhaotian, Wang, Wei, Zhao, Jieyu |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2504.04855 |
Similar Items
Images Speak Louder than Words: Understanding and Mitigating Bias in Vision-Language Model from a Causal Mediation Perspective
by: Weng, Zhaotian, et al.
Published: (2024)
InterIntent: Investigating Social Intelligence of LLMs via Intention Understanding in an Interactive Game Context
by: Liu, Ziyi, et al.
Published: (2024)
On the Resilience of LLM-Based Multi-Agent Collaboration with Faulty Agents
by: Huang, Jen-tse, et al.
Published: (2024)
CodeCrash: Exposing LLM Fragility to Misleading Natural Language in Code Reasoning
by: Lam, Man Ho, et al.
Published: (2025)
MIRAI: Evaluating LLM Agents for Event Forecasting
by: Ye, Chenchen, et al.
Published: (2024)
On the Failure of Latent State Persistence in Large Language Models
by: Huang, Jen-tse, et al.
Published: (2025)
FairCoder: Evaluating Social Bias of LLMs in Code Generation
by: Du, Yongkang, et al.
Published: (2025)
What's Missing in Vision-Language Models? Probing Their Struggles with Causal Order Reasoning
by: Weng, Zhaotian, et al.
Published: (2025)
Group-Evolving Agents: Open-Ended Self-Improvement via Experience Sharing
by: Weng, Zhaotian, et al.
Published: (2026)
Learning to Ask: When LLM Agents Meet Unclear Instruction
by: Wang, Wenxuan, et al.
Published: (2024)
Towards More Accurate US Presidential Election via Multi-step Reasoning with Large Language Models
by: Yu, Chenxiao, et al.
Published: (2024)
FAIRGAMER: Evaluating Social Biases in LLM-Based Video Game NPCs
by: Shi, Bingkang, et al.
Published: (2025)
On the Shortcut Learning in Multilingual Neural Machine Translation
by: Wang, Wenxuan, et al.
Published: (2024)
Structured Reasoning for Fairness: A Multi-Agent Approach to Bias Detection in Textual Data
by: Huang, Tianyi, et al.
Published: (2025)
Mitigating Bias for Question Answering Models by Tracking Bias Influence
by: Ma, Mingyu Derek, et al.
Published: (2023)
STAR: Boosting Low-Resource Information Extraction by Structure-to-Text Data Generation with Large Language Models
by: Ma, Mingyu Derek, et al.
Published: (2023)
Social Welfare Function Leaderboard: When LLM Agents Allocate Social Welfare
by: Shi, Zhengliang, et al.
Published: (2025)
AI Sees Your Location, But With A Bias Toward The Wealthy World
by: Huang, Jingyuan, et al.
Published: (2025)
How to Interpret Agent Behavior
by: Gao, Jie, et al.
Published: (2026)
New Job, New Gender? Measuring the Social Bias in Image Generation Models
by: Wang, Wenxuan, et al.
Published: (2024)
Skill-Pro: Learning Reusable Skills from Experience via Non-Parametric PPO for LLM Agents
by: Mi, Qirui, et al.
Published: (2026)
VisBias: Measuring Explicit and Implicit Social Biases in Vision Language Models
by: Huang, Jen-tse, et al.
Published: (2025)
Examining Agents' Bias Amplification versus Suppression in Multi-Agent Systems
by: Wu, Zejian Eric, et al.
Published: (2026)
Not All Countries Celebrate Thanksgiving: On the Cultural Dominance in Large Language Models
by: Wang, Wenxuan, et al.
Published: (2023)
SWE-Chain: Benchmarking Coding Agents on Chained Release-Level Package Upgrades
by: Lam, Man Ho, et al.
Published: (2026)
CoSER: A Comprehensive Literary Dataset and Framework for Training and Evaluating LLM Role-Playing and Persona Simulation
by: Wang, Xintao, et al.
Published: (2025)
GIVE: Structured Reasoning of Large Language Models with Knowledge Graph Inspired Veracity Extrapolation
by: He, Jiashu, et al.
Published: (2024)
Exploring the Impact of Personality Traits on LLM Bias and Toxicity
by: Wang, Shuo, et al.
Published: (2025)
How Far Are We on the Decision-Making of LLMs? Evaluating LLMs' Gaming Ability in Multi-Agent Environments
by: Huang, Jen-tse, et al.
Published: (2024)
All Languages Matter: On the Multilingual Safety of Large Language Models
by: Wang, Wenxuan, et al.
Published: (2023)
To Call or Not to Call: Diagnosing Intrinsic Over-Calling Bias in LLM Agents
by: Shi, Wei, et al.
Published: (2026)
MMR-Bench: A Comprehensive Benchmark for Multimodal LLM Routing
by: Ma, Haoxuan, et al.
Published: (2026)
Semantic Trajectory Data Mining with LLM-Informed POI Classification
by: Liu, Yifan, et al.
Published: (2024)
Structured Personality Control and Adaptation for LLM Agents
by: Wang, Jinpeng, et al.
Published: (2026)
A Survey on the Safety and Security Threats of Computer-Using Agents: JARVIS or Ultron?
by: Chen, Ada, et al.
Published: (2025)
Towards Evaluating Proactive Risk Awareness of Multimodal Language Models
by: Yuan, Youliang, et al.
Published: (2025)
Refuse Whenever You Feel Unsafe: Improving Safety in LLMs via Decoupled Refusal Training
by: Yuan, Youliang, et al.
Published: (2024)
"You Gotta be a Doctor, Lin": An Investigation of Name-Based Bias of Large Language Models in Employment Recommendations
by: Nghiem, Huy, et al.
Published: (2024)
Safer-Instruct: Aligning Language Models with Automated Preference Data
by: Shi, Taiwei, et al.
Published: (2023)
Does Differential Privacy Impact Bias in Pretrained NLP Models?
by: Islam, Md. Khairul, et al.
Published: (2024)