| Main authors: | Zhang, Jianyi; Liu, Shizhao; Zhou, Ziyin; Li, Zhen |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online access: | https://arxiv.org/abs/2512.18755 |
Similar items
CAPTURE: A Benchmark and Evaluation for LVLMs in CAPTCHA Resolving
by: Zhang, Jianyi, et al.
Published: (2025)
Oedipus and the Sphinx: Benchmarking and Improving Visual Language Models for Complex Graphic Reasoning
by: Zhang, Jianyi, et al.
Published: (2025)
Tug-of-War within A Decade: Conflict Resolution in Vulnerability Analysis via Teacher-Guided Retrieval-Augmented Generations
by: Zhou, Ziyin, et al.
Published: (2026)
Structured Progressive Knowledge Activation for LLM-Driven Neural Architecture Search
by: Liu, Zhen, et al.
Published: (2026)
Formalization Driven LLM Prompt Jailbreaking via Reinforcement Learning
by: Wang, Zhaoqi, et al.
Published: (2025)
Adaptive Prompt Embedding Optimization for LLM Jailbreaking
by: Li, Miles Q., et al.
Published: (2026)
An Enhanced Prompt-Based LLM Reasoning Scheme via Knowledge Graph-Integrated Collaboration
by: Li, Yihao, et al.
Published: (2024)
Align to Misalign: Automatic LLM Jailbreak with Meta-Optimized LLM Judges
by: Koo, Hamin, et al.
Published: (2025)
Remove Symmetries to Control Model Expressivity and Improve Optimization
by: Ziyin, Liu, et al.
Published: (2024)
Metis: Learning to Jailbreak LLMs via Self-Evolving Metacognitive Policy Optimization
by: Zhou, Huilin, et al.
Published: (2026)
Can't say cant? Measuring and Reasoning of Dark Jargons in Large Language Models
by: Ji, Xu, et al.
Published: (2024)
Are We Merely Justifying Results ex Post Facto? Quantifying Explanatory Inversion in Post-Hoc Model Explanations
by: Tan, Zhen, et al.
Published: (2025)
Efficient LLM-Jailbreaking via Multimodal-LLM Jailbreak
by: Ji, Haoxuan, et al.
Published: (2024)
Pseudo-label Based Domain Adaptation for Zero-Shot Text Steganalysis
by: Luo, Yufei, et al.
Published: (2024)
Advancing Deep Learning through Probability Engineering: A Pragmatic Paradigm for Modern AI
by: Zhang, Jianyi
Published: (2025)
Jailbreak-as-a-Service++: Unveiling Distributed AI-Driven Malicious Information Campaigns Powered by LLM Crowdsourcing
by: Yan, Yu, et al.
Published: (2025)
Beyond Fixed Anchors: Precisely Erasing Concepts with Sibling Exclusive Counterparts
by: Zhang, Tong, et al.
Published: (2025)
Three Mechanisms of Feature Learning in a Linear Network
by: Xu, Yizhou, et al.
Published: (2024)
MetaBreak: Jailbreaking Online LLM Services via Special Token Manipulation
by: Zhu, Wentian, et al.
Published: (2025)
Jailbreak-R1: Exploring the Jailbreak Capabilities of LLMs via Reinforcement Learning
by: Guo, Weiyang, et al.
Published: (2025)
Jailbreaking LLM-Controlled Robots
by: Robey, Alexander, et al.
Published: (2024)
From Black Box to Transparency: Enhancing Automated Interpreting Assessment with Explainable AI in College Classrooms
by: Jiang, Zhaokun, et al.
Published: (2025)
Noise Balance and Stationary Distribution of Stochastic Gradient Descent
by: Ziyin, Liu, et al.
Published: (2023)
PathSeeker: Exploring LLM Security Vulnerabilities with a Reinforcement Learning-Based Jailbreak Approach
by: Lin, Zhihao, et al.
Published: (2024)
Layer-Level Self-Exposure and Patch: Affirmative Token Mitigation for Jailbreak Attack Defense
by: Ouyang, Yang, et al.
Published: (2025)
F2LLM-v2: Inclusive, Performant, and Efficient Embeddings for a Multilingual World
by: Zhang, Ziyin, et al.
Published: (2026)
Boosting Jailbreak Transferability for Large Language Models
by: Liu, Hanqing, et al.
Published: (2024)
LLM Jailbreak Detection for (Almost) Free!
by: Chen, Guorui, et al.
Published: (2025)
F2LLM Technical Report: Matching SOTA Embedding Performance with 6 Million Open-Source Data
by: Zhang, Ziyin, et al.
Published: (2025)
Geneshift: Impact of different scenario shift on Jailbreaking LLM
by: Wu, Tianyi, et al.
Published: (2025)
DrAttack: Prompt Decomposition and Reconstruction Makes Powerful LLM Jailbreakers
by: Li, Xirui, et al.
Published: (2024)
Sirens' Whisper: Inaudible Near-Ultrasonic Jailbreaks of Speech-Driven LLMs
by: Ling, Zijian, et al.
Published: (2026)
Hiding in Plain Sight: A Steganographic Approach to Stealthy LLM Jailbreaks
by: Geng, Jianing, et al.
Published: (2025)
DICE: Disentangling Artist Style from Content via Contrastive Subspace Decomposition in Diffusion Models
by: Zhang, Tong, et al.
Published: (2026)
Unraveling LLM Jailbreaks Through Safety Knowledge Neurons
by: Zhao, Chongwen, et al.
Published: (2025)
Bleeding Pathways: Vanishing Discriminability in LLM Hidden States Fuels Jailbreak Attacks
by: Zhang, Yingjie, et al.
Published: (2025)
LLM-VA: Resolving the Jailbreak-Overrefusal Trade-off via Vector Alignment
by: Zhang, Haonan, et al.
Published: (2026)
An Optimizable Suffix Is Worth A Thousand Templates: Efficient Black-box Jailbreaking without Affirmative Phrases via LLM as Optimizer
by: Jiang, Weipeng, et al.
Published: (2024)
AutoEDA: Enabling EDA Flow Automation through Microservice-Based LLM Agents
by: Lu, Yiyi, et al.
Published: (2025)
DRAGON: LLM-Driven Decomposition and Reconstruction Agents for Large-Scale Combinatorial Optimization
by: Chen, Shengkai, et al.
Published: (2026)