:: Library Catalog

Bibliographic Details
Main Authors:	Li, Xuying; Li, Zhuo; Kosuga, Yuji; Yoshida, Yasuhiro; Bian, Victor
Format:	Preprint
Published:	2024
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2412.04415

Similar Items

Precision Knowledge Editing: Enhancing Safety in Large Language Models
by: Li, Xuying, et al.
Published: (2024)

Output Length Effect on DeepSeek-R1's Safety in Forced Thinking
by: Li, Xuying, et al.
Published: (2025)

Layer-Wise Perturbations via Sparse Autoencoders for Adversarial Text Generation
by: Shu, Huizhen, et al.
Published: (2025)

Optimizing Safe and Aligned Language Generation: A Multi-Objective GRPO Approach
by: Li, Xuying, et al.
Published: (2025)

LatentGuard: Controllable Latent Steering for Robust Refusal of Attacks and Reliable Response Generation
by: Shu, Huizhen, et al.
Published: (2025)

The Resurgence of GCG Adversarial Attacks on Large Language Models
by: Tan, Yuting, et al.
Published: (2025)

Commercial LLM Agents Are Already Vulnerable to Simple Yet Dangerous Attacks
by: Li, Ang, et al.
Published: (2025)

Targeted Bit-Flip Attacks on LLM-Based Agents
by: Wang, Jialai, et al.
Published: (2026)

Controllable Mathematical Reasoning via Self-Optimizing Thought Vectors
by: Li, Xuying
Published: (2025)

LLM-based Agents Suffer from Hallucinations: A Survey of Taxonomy, Methods, and Directions
by: Lin, Xixun, et al.
Published: (2025)

E^2GraphRAG: Streamlining Graph-based RAG for High Efficiency and Effectiveness
by: Zhao, Yibo, et al.
Published: (2025)

CheatAgent: Attacking LLM-Empowered Recommender Systems via LLM Agent
by: Ning, Liang-bo, et al.
Published: (2025)

FT-Dojo: Towards Autonomous LLM Fine-Tuning with Language Agents
by: Li, Qizheng, et al.
Published: (2026)

AutoBackdoor: Automating Backdoor Attacks via LLM Agents
by: Li, Yige, et al.
Published: (2025)

Manipulating LLM Web Agents with Indirect Prompt Injection Attack via HTML Accessibility Tree
by: Johnson, Sam, et al.
Published: (2025)

BackdoorAgent: A Unified Framework for Backdoor Attacks on LLM-based Agents
by: Feng, Yunhao, et al.
Published: (2026)

IP Leakage Attacks Targeting LLM-Based Multi-Agent Systems
by: Wang, Liwen, et al.
Published: (2025)

Agent^2 RL-Bench: Can LLM Agents Engineer Agentic RL Post-Training?
by: Chen, Wanyi, et al.
Published: (2026)

Doppelganger Method: Breaking Role Consistency in LLM Agent via Prompt-based Transferable Adversarial Attack
by: Kang, Daewon, et al.
Published: (2025)

S$^4$ST: A Strong, Self-transferable, faSt, and Simple Scale Transformation for Transferable Targeted Attack
by: Liu, Yongxiang, et al.
Published: (2024)

RAG-targeted Adversarial Attack on LLM-based Threat Detection and Mitigation Framework
by: Ikbarieh, Seif, et al.
Published: (2025)

SEP-Attack: A Simple and Effective Paradigm for Transfer-Based Textual Adversarial Attack
by: Liu, Han, et al.
Published: (2026)

E-MIA: Exam-Style Black-Box Membership Inference Attacks against RAG Systems
by: Guan, Zelin, et al.
Published: (2026)

Micro-Act: Mitigating Knowledge Conflict in LLM-based RAG via Actionable Self-Reasoning
by: Huo, Nan, et al.
Published: (2025)

Attractive Metadata Attack: Inducing LLM Agents to Invoke Malicious Tools
by: Mo, Kanghua, et al.
Published: (2025)

EDGE: Efficient Data Selection for LLM Agents via Guideline Effectiveness
by: Zhang, Yunxiao, et al.
Published: (2025)

Vul-RAG: Enhancing LLM-based Vulnerability Detection via Knowledge-level RAG
by: Du, Xueying, et al.
Published: (2024)

Amplified Vulnerabilities: Structured Jailbreak Attacks on LLM-based Multi-Agent Debate
by: Qi, Senmao, et al.
Published: (2025)

A Concurrent Modular Agent: Framework for Autonomous LLM Agents
by: Maruyama, Norihiro, et al.
Published: (2025)

An Iterative LLM Framework for SIBT utilizing RAG-based Adaptive Weight Optimization
by: Xiao, Zhuo, et al.
Published: (2025)

Dive into Ambiguity: A*-Inspired Multi-Agents Commonsense Obfuscation Attack on LLM Prompts
by: Wang, Boxuan, et al.
Published: (2026)

A Simple and Effective Method for Uncertainty Quantification and OOD Detection
by: Ma, Yaxin, et al.
Published: (2025)

ALERT: Zero-shot LLM Jailbreak Detection via Internal Discrepancy Amplification
by: Lin, Xiao, et al.
Published: (2026)

ReliabilityRAG: Effective and Provably Robust Defense for RAG-based Web-Search
by: Shen, Zeyu, et al.
Published: (2025)

SimpleMem: Efficient Lifelong Memory for LLM Agents
by: Liu, Jiaqi, et al.
Published: (2026)

Towards Effective, Stealthy, and Persistent Backdoor Attacks Targeting Graph Foundation Models
by: Luo, Jiayi, et al.
Published: (2025)

AgentInit: Initializing LLM-based Multi-Agent Systems via Diversity and Expertise Orchestration for Effective and Efficient Collaboration
by: Tian, Chunhao, et al.
Published: (2025)

Evaluating LLM-based Agents for Multi-Turn Conversations: A Survey
by: Guan, Shengyue, et al.
Published: (2025)

RAG-Enhanced Collaborative LLM Agents for Drug Discovery
by: Lee, Namkyeong, et al.
Published: (2025)

VisualTrap: A Stealthy Backdoor Attack on GUI Agents via Visual Grounding Manipulation
by: Ye, Ziang, et al.
Published: (2025)