| Main Authors: | Bullwinkel, Blake, Minnich, Amanda, Chawla, Shiven, Lopez, Gary, Pouliot, Martin, Maxwell, Whitney, de Gruyter, Joris, Pratt, Katherine, Qi, Saphir, Chikanov, Nina, Lutz, Roman, Dheekonda, Raja Sekhar Rao, Jagdagdorj, Bolor-Erdene, Kim, Eugenia, Song, Justin, Hines, Keegan, Jones, Daniel, Severi, Giorgio, Lundeen, Richard, Vaughan, Sam, Westerhoff, Victoria, Bryan, Pete, Kumar, Ram Shankar Siva, Zunger, Yonatan, Kawaguchi, Chang, Russinovich, Mark |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2501.07238 |
Similar Items
PyRIT: A Framework for Security Risk Identification and Red Teaming in Generative AI System
by: Munoz, Gary D. Lopez, et al.
Published: (2024)
The Trigger in the Haystack: Extracting and Reconstructing LLM Backdoor Triggers
by: Bullwinkel, Blake, et al.
Published: (2026)
A Representation Engineering Perspective on the Effectiveness of Multi-Turn Jailbreaks
by: Bullwinkel, Blake, et al.
Published: (2025)
Phi-3 Safety Post-Training: Aligning Language Models with a "Break-Fix" Cycle
by: Haider, Emman, et al.
Published: (2024)
GRP-Obliteration: Unaligning LLMs With a Single Unlabeled Prompt
by: Russinovich, Mark, et al.
Published: (2026)
A Systematization of Security Vulnerabilities in Computer Use Agents
by: Jones, Daniel, et al.
Published: (2025)
Defending Against Indirect Prompt Injection Attacks With Spotlighting
by: Hines, Keegan, et al.
Published: (2024)
CEO Locality and Employment Stickiness
by: Sohee Park, et al.
Published: (2024)
Redefining AI Red Teaming in the Agentic Era: From Weeks to Hours
by: Dheekonda, Raja Sekhar Rao, et al.
Published: (2026)
The History and Development of Rural Public Libraries.
by: deGruyter, Lisa
Published: (1980)
The Small Public Library in the U.S.A.
by: DeGruyter, Lisa, et al.
Published: (1988)
Jailbreaking is (Mostly) Simpler Than You Think
by: Russinovich, Mark, et al.
Published: (2025)
Obliviate: Efficient Unmemorization for Protecting Intellectual Property in Large Language Models
by: Russinovich, Mark, et al.
Published: (2025)
Hey, That's My Model! Introducing Chain & Hash, An LLM Fingerprinting Technique
by: Russinovich, Mark, et al.
Published: (2024)
The Role of Microcomputers in Libraries.
by: Lundeen, Gerald
Published: (1980)
Preparative, Enactive, and Intertwined Theories of Change: Cultural Practitioners Influencing Conflict in Ecuador
by: Sarah Ullom-Minnich
Published: (2019)
Lightning detection rates and wildland fire in the mountains of northern Baja California, Mexico
by: Richard A. Minnich
Published: (1993)
Microforms & Secondary Schools
by: Minnich, Nancy P.
Published: (1972)
The "Clipping Thesis": A Year Later.
by: Minnich, Nancy P.
Published: (1987)
Polyominoes with maximal number of deep holes
by: Baralic, Djordje, et al.
Published: (2026)
Great, Now Write an Article About That: The Crescendo Multi-Turn LLM Jailbreak Attack
by: Russinovich, Mark, et al.
Published: (2024)
Biosorption of humic and fulvic acids to live activated sludge biomass
by: Esparza, M., Westerhoff, P
Published: (2003)
Nitrate removal in zero-valent iron packed columns
by: Westerhoff, P., James, J
Published: (2003)
The green transition of firms: The role of evolutionary competition, adjustment costs, transition risk, and green technology progress
by: Radi, Davide, et al.
Published: (2024)
Microcomputer-Based Library Catalog Software.
by: Lundeen, Gerald, et al.
Published: (1984)
Energy-lowering symmetry breaking creates a flat-band insulator in paramagnetic Nb3Cl8
by: Xiong, Jia-Xin, et al.
Published: (2024)
An Adversarial Approach to Structural Estimation
by: Kaji, Tetsuya, et al.
Published: (2020)
Which GRS Statistic Is Appropriate for Cross‐Sectional Tests of Linear Multi‐Factor Pricing Models?
by: Dimitrios Asteriou, et al.
Published: (2026)
Valleytronics and negative differential resistance in cubic boron nitride: a first-principles study
by: Hatanpää, Benjamin, et al.
Published: (2024)
Symmetry breaking transforms strong to normal correlation and false metals to true insulators
by: Zunger, Alex, et al.
Published: (2025)
Software Choices for In-House Databases.
by: Tenopir, Carol, et al.
Published: (1988)
LogiPlan: A Structured Benchmark for Logical Planning and Relational Reasoning in LLMs
by: Cai, Yanan, et al.
Published: (2025)
Degrees of Freedom and Information Criteria for the Synthetic Control Method
by: Pouliot, Guillaume Allaire, et al.
Published: (2022)
Group Shapley Value and Counterfactual Simulations in a Structural Model
by: Kwon, Yongchan, et al.
Published: (2024)
Atomic layer etching of SiO$_2$ using sequential exposures of Al(CH$_3$)$_3$ and H$_2$/SF$_6$ plasma
by: Catherall, David, et al.
Published: (2024)
Evolutionary poker lacks a full deck when modelling the LTEE Cit+ phenotype
by: Scott A. Minnich, et al.
Published: (2025)
OMNIGUARD: An Efficient Approach for AI Safety Moderation Across Languages and Modalities
by: Verma, Sahil, et al.
Published: (2025)
Fake News and Asset Price Dynamics
by: Pellizzari, Paolo, et al.
Published: (2024)
Reddit as a Relational Ecosystem: Understanding Ambivalence, Anonymous Suicide Disclosures, and Peer Responses
by: Jessica Meléndez Tyler, et al.
Published: (2026)
Cutting Through Stigma: Suggested Best Practices for a Harm Reduction Approach to Nonsuicidal Self‐Injury
by: Lindsay A. Lundeen, et al.
Published: (2025)