| Main Authors: | Liu, Yifei, Cui, Yu, Zhang, Haibin |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2505.17106 |
Similar Items
Red Teaming Large Language Models for Healthcare
by: Balazadeh, Vahid, et al.
Published: (2025)
Resource Consumption Red-Teaming for Large Vision-Language Models
by: Gao, Haoran, et al.
Published: (2025)
RedTopic: Toward Topic-Diverse Red Teaming of Large Language Models
by: Ding, Jiale, et al.
Published: (2025)
Red Teaming Visual Language Models
by: Li, Mukai, et al.
Published: (2024)
Stop Fixating on Prompts: Reasoning Hijacking and Constraint Tightening for Red-Teaming LLM Agents
by: Mao, Yanxu, et al.
Published: (2026)
RedBench: A Universal Dataset for Comprehensive Red Teaming of Large Language Models
by: Dang, Quy-Anh, et al.
Published: (2026)
Learning-Based Automated Adversarial Red-Teaming for Robustness Evaluation of Large Language Models
by: Wei, Zhang, et al.
Published: (2025)
Red-Teaming for Inducing Societal Bias in Large Language Models
by: Luo, Chu Fei, et al.
Published: (2024)
RedAgent: Red Teaming Large Language Models with Context-aware Autonomous Language Agent
by: Xu, Huiyu, et al.
Published: (2024)
Gradient-Based Language Model Red Teaming
by: Wichers, Nevan, et al.
Published: (2024)
GenBreak: Red Teaming Text-to-Image Generators Using Large Language Models
by: Wang, Zilong, et al.
Published: (2025)
TOOLCAD: Exploring Tool-Using Large Language Models in Text-to-CAD Generation with Reinforcement Learning
by: Gong, Yifei, et al.
Published: (2026)
TroubleLLM: Align to Red Team Expert
by: Xu, Zhuoer, et al.
Published: (2024)
TRIDENT: Enhancing Large Language Model Safety with Tri-Dimensional Diversified Red-Teaming Data Synthesis
by: Wu, Xiaorui, et al.
Published: (2025)
Teaching Thinking Models to Reason with Tools: A Full-Pipeline Recipe for Tool-Integrated Reasoning
by: Cheng, Qianjia, et al.
Published: (2026)
Auto-RT: Automatic Jailbreak Strategy Exploration for Red-Teaming Large Language Models
by: Liu, Yanjiang, et al.
Published: (2025)
Red Teaming Language Models for Processing Contradictory Dialogues
by: Wen, Xiaofei, et al.
Published: (2024)
Operationalizing a Threat Model for Red-Teaming Large Language Models (LLMs)
by: Verma, Apurv, et al.
Published: (2024)
Jailbreak-Zero: A Path to Pareto Optimal Red Teaming for Large Language Models
by: Hu, Kai, et al.
Published: (2025)
When Prompt Optimization Becomes Jailbreaking: Adaptive Red-Teaming of Large Language Models
by: Shamsi, Zafir, et al.
Published: (2026)
Enhancing Tool Learning in Large Language Models with Hierarchical Error Checklists
by: Cui, Yue, et al.
Published: (2025)
Building Safe GenAI Applications: An End-to-End Overview of Red Teaming for Large Language Models
by: Purpura, Alberto, et al.
Published: (2025)
Red Teaming for Large Language Models At Scale: Tackling Hallucinations on Mathematics Tasks
by: Buszydlik, Aleksander, et al.
Published: (2023)
Re-Initialization Token Learning for Tool-Augmented Large Language Models
by: Li, Chenghao, et al.
Published: (2025)
Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models
by: Yang, Ling, et al.
Published: (2024)
Chain-of-Knowledge: Integrating Knowledge Reasoning into Large Language Models by Learning from Knowledge Graphs
by: Zhang, Yifei, et al.
Published: (2024)
Unleashing Spatial Reasoning in Multimodal Large Language Models via Textual Representation Guided Reasoning
by: Hua, Jiacheng, et al.
Published: (2026)
CREATOR: Tool Creation for Disentangling Abstract and Concrete Reasoning of Large Language Models
by: Qian, Cheng, et al.
Published: (2023)
EmoLLM: Appraisal-Grounded Cognitive-Emotional Co-Reasoning in Large Language Models
by: Zhang, Yifei, et al.
Published: (2026)
BiasGuard: A Reasoning-enhanced Bias Detection Tool For Large Language Models
by: Fan, Zhiting, et al.
Published: (2025)
STAR: SocioTechnical Approach to Red Teaming Language Models
by: Weidinger, Laura, et al.
Published: (2024)
MUR: Momentum Uncertainty guided Reasoning for Large Language Models
by: Yan, Hang, et al.
Published: (2025)
StableToolBench: Towards Stable Large-Scale Benchmarking on Tool Learning of Large Language Models
by: Guo, Zhicheng, et al.
Published: (2024)
MUSE: MCTS-Driven Red Teaming Framework for Enhanced Multi-Turn Dialogue Safety in Large Language Models
by: Yan, Siyu, et al.
Published: (2025)
TInR: Exploring Tool-Internalized Reasoning in Large Language Models
by: Xu, Qiancheng, et al.
Published: (2026)
Red Teaming Multimodal Language Models: Evaluating Harm Across Prompt Modalities and Models
by: Van Doren, Madison, et al.
Published: (2025)
ALERT: A Comprehensive Benchmark for Assessing Large Language Models' Safety through Red Teaming
by: Tedeschi, Simone, et al.
Published: (2024)
Nemotron-Research-Tool-N1: Exploring Tool-Using Language Models with Reinforced Reasoning
by: Zhang, Shaokun, et al.
Published: (2025)
Holistic Automated Red Teaming for Large Language Models through Top-Down Test Case Generation and Multi-turn Interaction
by: Zhang, Jinchuan, et al.
Published: (2024)
MATHSENSEI: A Tool-Augmented Large Language Model for Mathematical Reasoning
by: Das, Debrup, et al.
Published: (2024)