Library Catalog

Bibliographic Details
Main authors:	Zhang, Jianyi; Liu, Shizhao; Zhou, Ziyin; Li, Zhen
Format:	Preprint
Published:	2025
Subjects:	Artificial Intelligence
Online access:	https://arxiv.org/abs/2512.18755

Similar items

CAPTURE: A Benchmark and Evaluation for LVLMs in CAPTCHA Resolving
by: Zhang, Jianyi, et al.
Published: (2025)

Oedipus and the Sphinx: Benchmarking and Improving Visual Language Models for Complex Graphic Reasoning
by: Zhang, Jianyi, et al.
Published: (2025)

Tug-of-War within A Decade: Conflict Resolution in Vulnerability Analysis via Teacher-Guided Retrieval-Augmented Generations
by: Zhou, Ziyin, et al.
Published: (2026)

Structured Progressive Knowledge Activation for LLM-Driven Neural Architecture Search
by: Liu, Zhen, et al.
Published: (2026)

Formalization Driven LLM Prompt Jailbreaking via Reinforcement Learning
by: Wang, Zhaoqi, et al.
Published: (2025)

Adaptive Prompt Embedding Optimization for LLM Jailbreaking
by: Li, Miles Q., et al.
Published: (2026)

An Enhanced Prompt-Based LLM Reasoning Scheme via Knowledge Graph-Integrated Collaboration
by: Li, Yihao, et al.
Published: (2024)

Align to Misalign: Automatic LLM Jailbreak with Meta-Optimized LLM Judges
by: Koo, Hamin, et al.
Published: (2025)

Remove Symmetries to Control Model Expressivity and Improve Optimization
by: Ziyin, Liu, et al.
Published: (2024)

Metis: Learning to Jailbreak LLMs via Self-Evolving Metacognitive Policy Optimization
by: Zhou, Huilin, et al.
Published: (2026)

Can't say cant? Measuring and Reasoning of Dark Jargons in Large Language Models
by: Ji, Xu, et al.
Published: (2024)

Are We Merely Justifying Results ex Post Facto? Quantifying Explanatory Inversion in Post-Hoc Model Explanations
by: Tan, Zhen, et al.
Published: (2025)

Efficient LLM-Jailbreaking via Multimodal-LLM Jailbreak
by: Ji, Haoxuan, et al.
Published: (2024)

Pseudo-label Based Domain Adaptation for Zero-Shot Text Steganalysis
by: Luo, Yufei, et al.
Published: (2024)

Advancing Deep Learning through Probability Engineering: A Pragmatic Paradigm for Modern AI
by: Zhang, Jianyi
Published: (2025)

Jailbreak-as-a-Service++: Unveiling Distributed AI-Driven Malicious Information Campaigns Powered by LLM Crowdsourcing
by: Yan, Yu, et al.
Published: (2025)

Beyond Fixed Anchors: Precisely Erasing Concepts with Sibling Exclusive Counterparts
by: Zhang, Tong, et al.
Published: (2025)

Three Mechanisms of Feature Learning in a Linear Network
by: Xu, Yizhou, et al.
Published: (2024)

MetaBreak: Jailbreaking Online LLM Services via Special Token Manipulation
by: Zhu, Wentian, et al.
Published: (2025)

Jailbreak-R1: Exploring the Jailbreak Capabilities of LLMs via Reinforcement Learning
by: Guo, Weiyang, et al.
Published: (2025)

Jailbreaking LLM-Controlled Robots
by: Robey, Alexander, et al.
Published: (2024)

From Black Box to Transparency: Enhancing Automated Interpreting Assessment with Explainable AI in College Classrooms
by: Jiang, Zhaokun, et al.
Published: (2025)

Noise Balance and Stationary Distribution of Stochastic Gradient Descent
by: Ziyin, Liu, et al.
Published: (2023)

PathSeeker: Exploring LLM Security Vulnerabilities with a Reinforcement Learning-Based Jailbreak Approach
by: Lin, Zhihao, et al.
Published: (2024)

Layer-Level Self-Exposure and Patch: Affirmative Token Mitigation for Jailbreak Attack Defense
by: Ouyang, Yang, et al.
Published: (2025)

F2LLM-v2: Inclusive, Performant, and Efficient Embeddings for a Multilingual World
by: Zhang, Ziyin, et al.
Published: (2026)

Boosting Jailbreak Transferability for Large Language Models
by: Liu, Hanqing, et al.
Published: (2024)

LLM Jailbreak Detection for (Almost) Free!
by: Chen, Guorui, et al.
Published: (2025)

F2LLM Technical Report: Matching SOTA Embedding Performance with 6 Million Open-Source Data
by: Zhang, Ziyin, et al.
Published: (2025)

Geneshift: Impact of different scenario shift on Jailbreaking LLM
by: Wu, Tianyi, et al.
Published: (2025)

DrAttack: Prompt Decomposition and Reconstruction Makes Powerful LLM Jailbreakers
by: Li, Xirui, et al.
Published: (2024)

Sirens' Whisper: Inaudible Near-Ultrasonic Jailbreaks of Speech-Driven LLMs
by: Ling, Zijian, et al.
Published: (2026)

Hiding in Plain Sight: A Steganographic Approach to Stealthy LLM Jailbreaks
by: Geng, Jianing, et al.
Published: (2025)

DICE: Disentangling Artist Style from Content via Contrastive Subspace Decomposition in Diffusion Models
by: Zhang, Tong, et al.
Published: (2026)

Unraveling LLM Jailbreaks Through Safety Knowledge Neurons
by: Zhao, Chongwen, et al.
Published: (2025)

Bleeding Pathways: Vanishing Discriminability in LLM Hidden States Fuels Jailbreak Attacks
by: Zhang, Yingjie, et al.
Published: (2025)

LLM-VA: Resolving the Jailbreak-Overrefusal Trade-off via Vector Alignment
by: Zhang, Haonan, et al.
Published: (2026)

An Optimizable Suffix Is Worth A Thousand Templates: Efficient Black-box Jailbreaking without Affirmative Phrases via LLM as Optimizer
by: Jiang, Weipeng, et al.
Published: (2024)

AutoEDA: Enabling EDA Flow Automation through Microservice-Based LLM Agents
by: Lu, Yiyi, et al.
Published: (2025)

DRAGON: LLM-Driven Decomposition and Reconstruction Agents for Large-Scale Combinatorial Optimization
by: Chen, Shengkai, et al.
Published: (2026)