| Main Authors: | Lyu, Yuefei, Li, Chaozhuo, Zhang, Xi, Zhang, Tianle |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2506.13276 |
Similar Items
Beyond Surface-Level Detection: Towards Cognitive-Driven Defense Against Jailbreak Attacks via Meta-Operations Reasoning
by: Pu, Rui, et al.
Published: (2025)
Intruding with Words: Towards Understanding Graph Injection Attacks at the Text Level
by: Lei, Runlin, et al.
Published: (2024)
Graph of Attacks: Improved Black-Box and Interpretable Jailbreaks for LLMs
by: Akbar-Tajari, Mohammad, et al.
Published: (2025)
Effective and Efficient Jailbreaks of Black-Box LLMs with Cross-Behavior Attacks
by: Gohil, Vasudev
Published: (2025)
HQA-Attack: Toward High Quality Black-Box Hard-Label Adversarial Attack on Text
by: Liu, Han, et al.
Published: (2024)
LLMs are Introvert
by: Zhang, Litian, et al.
Published: (2025)
Reasoning Paths as Signals: Augmenting Multi-hop Fact Verification through Structural Reasoning Progression
by: Zheng, Liwen, et al.
Published: (2025)
Diffusion with a Linguistic Compass: Steering the Generation of Clinically Plausible Future sMRI Representations for Early MCI Conversion Prediction
by: Tang, Zhihao, et al.
Published: (2025)
Ask, Attend, Attack: A Effective Decision-Based Black-Box Targeted Attack for Image-to-Text Models
by: Zeng, Qingyuan, et al.
Published: (2024)
One SPACE to Rule Them All: Jointly Mitigating Factuality and Faithfulness Hallucinations in LLMs
by: Wang, Pengbo, et al.
Published: (2025)
LiSA: Leveraging Link Recommender to Attack Graph Neural Networks via Subgraph Injection
by: Zhang, Wenlun, et al.
Published: (2025)
Feint and Attack: Attention-Based Strategies for Jailbreaking and Protecting LLMs
by: Pu, Rui, et al.
Published: (2024)
Tree of Attacks: Jailbreaking Black-Box LLMs Automatically
by: Mehrotra, Anay, et al.
Published: (2023)
Where Fake Citations Are Made: Tracing Field-Level Hallucination to Specific Neurons in LLMs
by: Chen, Yuefei, et al.
Published: (2026)
StruPhantom: Evolutionary Injection Attacks on Black-Box Tabular Agents Powered by Large Language Models
by: Feng, Yang, et al.
Published: (2025)
ExecTune: Effective Steering of Black-Box LLMs with Guide Models
by: Lingam, Vijay, et al.
Published: (2026)
Generalizing Knowledge Graph Embedding with Universal Orthogonal Parameterization
by: Li, Rui, et al.
Published: (2024)
Sharpness-Aware Black-Box Optimization
by: Ye, Feiyang, et al.
Published: (2024)
The Scales of Justitia: A Comprehensive Survey on Safety Evaluation of LLMs
by: Liu, Songyang, et al.
Published: (2025)
Matryoshka Pilot: Learning to Drive Black-Box LLMs with LLMs
by: Li, Changhao, et al.
Published: (2024)
Can LLMs Effectively Leverage Graph Structural Information through Prompts, and Why?
by: Huang, Jin, et al.
Published: (2023)
Can LLMs Fool Graph Learning? Exploring Universal Adversarial Attacks on Text-Attributed Graphs
by: Chen, Zihui, et al.
Published: (2026)
Towards Effective, Stealthy, and Persistent Backdoor Attacks Targeting Graph Foundation Models
by: Luo, Jiayi, et al.
Published: (2025)
FDLLM: A Dedicated Detector for Black-Box LLMs Fingerprinting
by: Fu, Zhiyuan, et al.
Published: (2025)
Conflict-Resilient Multi-Agent Reasoning via Signed Graph Modeling
by: He, Longgang, et al.
Published: (2026)
iDSE: Navigating Design Space Exploration in High-Level Synthesis Using LLMs
by: Li, Runkai, et al.
Published: (2025)
Automated Detection of Pre-training Text in Black-box LLMs
by: Hu, Ruihan, et al.
Published: (2025)
MirrorShield: Towards Universal Defense Against Jailbreaks via Entropy-Guided Mirror Crafting
by: Pu, Rui, et al.
Published: (2025)
Towards Black-Box Membership Inference Attack for Diffusion Models
by: Li, Jingwei, et al.
Published: (2024)
TrustGLM: Evaluating the Robustness of GraphLLMs Against Prompt, Text, and Structure Attacks
by: Zhang, Qihai, et al.
Published: (2025)
FactSelfCheck: Fact-Level Black-Box Hallucination Detection for LLMs
by: Sawczyn, Albert, et al.
Published: (2025)
Leveraging Large Language Models for Effective Label-free Node Classification in Text-Attributed Graphs
by: Zhang, Taiyan, et al.
Published: (2024)
Box-Free Model Watermarks Are Prone to Black-Box Removal Attacks
by: An, Haonan, et al.
Published: (2024)
ShadowCode: Towards (Automatic) External Prompt Injection Attack against Code LLMs
by: Yang, Yuchen, et al.
Published: (2024)
Effective Black-Box Multi-Faceted Attacks Breach Vision Large Language Model Guardrails
by: Yang, Yijun, et al.
Published: (2025)
A Survey of Calibration Process for Black-Box LLMs
by: Xie, Liangru, et al.
Published: (2024)
Analysis of LLMs Against Prompt Injection and Jailbreak Attacks
by: Jaiswal, Piyush, et al.
Published: (2026)
SoK: Pitfalls in Evaluating Black-Box Attacks
by: Suya, Fnu, et al.
Published: (2023)
AdapTools: Adaptive Tool-based Indirect Prompt Injection Attacks on Agentic LLMs
by: Wang, Che, et al.
Published: (2026)
Revitalizing Black-Box Interpretability: Actionable Interpretability for LLMs via Proxy Models
by: Liu, Junhao, et al.
Published: (2025)