:: Library Catalog

Bibliographic Details
Main Authors:	Lyu, Yuefei, Li, Chaozhuo, Zhang, Xi, Zhang, Tianle
Format:	Preprint
Published:	2025
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2506.13276

Similar Items

Beyond Surface-Level Detection: Towards Cognitive-Driven Defense Against Jailbreak Attacks via Meta-Operations Reasoning
by: Pu, Rui, et al.
Published: (2025)

Intruding with Words: Towards Understanding Graph Injection Attacks at the Text Level
by: Lei, Runlin, et al.
Published: (2024)

Graph of Attacks: Improved Black-Box and Interpretable Jailbreaks for LLMs
by: Akbar-Tajari, Mohammad, et al.
Published: (2025)

Effective and Efficient Jailbreaks of Black-Box LLMs with Cross-Behavior Attacks
by: Gohil, Vasudev
Published: (2025)

HQA-Attack: Toward High Quality Black-Box Hard-Label Adversarial Attack on Text
by: Liu, Han, et al.
Published: (2024)

LLMs are Introvert
by: Zhang, Litian, et al.
Published: (2025)

Reasoning Paths as Signals: Augmenting Multi-hop Fact Verification through Structural Reasoning Progression
by: Zheng, Liwen, et al.
Published: (2025)

Diffusion with a Linguistic Compass: Steering the Generation of Clinically Plausible Future sMRI Representations for Early MCI Conversion Prediction
by: Tang, Zhihao, et al.
Published: (2025)

Ask, Attend, Attack: A Effective Decision-Based Black-Box Targeted Attack for Image-to-Text Models
by: Zeng, Qingyuan, et al.
Published: (2024)

One SPACE to Rule Them All: Jointly Mitigating Factuality and Faithfulness Hallucinations in LLMs
by: Wang, Pengbo, et al.
Published: (2025)

LiSA: Leveraging Link Recommender to Attack Graph Neural Networks via Subgraph Injection
by: Zhang, Wenlun, et al.
Published: (2025)

Feint and Attack: Attention-Based Strategies for Jailbreaking and Protecting LLMs
by: Pu, Rui, et al.
Published: (2024)

Tree of Attacks: Jailbreaking Black-Box LLMs Automatically
by: Mehrotra, Anay, et al.
Published: (2023)

Where Fake Citations Are Made: Tracing Field-Level Hallucination to Specific Neurons in LLMs
by: Chen, Yuefei, et al.
Published: (2026)

StruPhantom: Evolutionary Injection Attacks on Black-Box Tabular Agents Powered by Large Language Models
by: Feng, Yang, et al.
Published: (2025)

ExecTune: Effective Steering of Black-Box LLMs with Guide Models
by: Lingam, Vijay, et al.
Published: (2026)

Generalizing Knowledge Graph Embedding with Universal Orthogonal Parameterization
by: Li, Rui, et al.
Published: (2024)

Sharpness-Aware Black-Box Optimization
by: Ye, Feiyang, et al.
Published: (2024)

The Scales of Justitia: A Comprehensive Survey on Safety Evaluation of LLMs
by: Liu, Songyang, et al.
Published: (2025)

Matryoshka Pilot: Learning to Drive Black-Box LLMs with LLMs
by: Li, Changhao, et al.
Published: (2024)

Can LLMs Effectively Leverage Graph Structural Information through Prompts, and Why?
by: Huang, Jin, et al.
Published: (2023)

Can LLMs Fool Graph Learning? Exploring Universal Adversarial Attacks on Text-Attributed Graphs
by: Chen, Zihui, et al.
Published: (2026)

Towards Effective, Stealthy, and Persistent Backdoor Attacks Targeting Graph Foundation Models
by: Luo, Jiayi, et al.
Published: (2025)

FDLLM: A Dedicated Detector for Black-Box LLMs Fingerprinting
by: Fu, Zhiyuan, et al.
Published: (2025)

Conflict-Resilient Multi-Agent Reasoning via Signed Graph Modeling
by: He, Longgang, et al.
Published: (2026)

iDSE: Navigating Design Space Exploration in High-Level Synthesis Using LLMs
by: Li, Runkai, et al.
Published: (2025)

Automated Detection of Pre-training Text in Black-box LLMs
by: Hu, Ruihan, et al.
Published: (2025)

MirrorShield: Towards Universal Defense Against Jailbreaks via Entropy-Guided Mirror Crafting
by: Pu, Rui, et al.
Published: (2025)

Towards Black-Box Membership Inference Attack for Diffusion Models
by: Li, Jingwei, et al.
Published: (2024)

TrustGLM: Evaluating the Robustness of GraphLLMs Against Prompt, Text, and Structure Attacks
by: Zhang, Qihai, et al.
Published: (2025)

FactSelfCheck: Fact-Level Black-Box Hallucination Detection for LLMs
by: Sawczyn, Albert, et al.
Published: (2025)

Leveraging Large Language Models for Effective Label-free Node Classification in Text-Attributed Graphs
by: Zhang, Taiyan, et al.
Published: (2024)

Box-Free Model Watermarks Are Prone to Black-Box Removal Attacks
by: An, Haonan, et al.
Published: (2024)

ShadowCode: Towards (Automatic) External Prompt Injection Attack against Code LLMs
by: Yang, Yuchen, et al.
Published: (2024)

Effective Black-Box Multi-Faceted Attacks Breach Vision Large Language Model Guardrails
by: Yang, Yijun, et al.
Published: (2025)

A Survey of Calibration Process for Black-Box LLMs
by: Xie, Liangru, et al.
Published: (2024)

Analysis of LLMs Against Prompt Injection and Jailbreak Attacks
by: Jaiswal, Piyush, et al.
Published: (2026)

SoK: Pitfalls in Evaluating Black-Box Attacks
by: Suya, Fnu, et al.
Published: (2023)

AdapTools: Adaptive Tool-based Indirect Prompt Injection Attacks on Agentic LLMs
by: Wang, Che, et al.
Published: (2026)

Revitalizing Black-Box Interpretability: Actionable Interpretability for LLMs via Proxy Models
by: Liu, Junhao, et al.
Published: (2025)