| Main Authors: | Hoscilowicz, Jakub, Popiolek, Pawel, Rudkowski, Jan, Bieniasz, Jedrzej, Janicki, Artur |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2406.02481 |
Similar Items
Adversarial Confusion Attack: Disrupting Multimodal Large Language Models
by: Hoscilowicz, Jakub, et al.
Published: (2025)
Large Language Models for Expansion of Spoken Language Understanding Systems to New Languages
by: Hoscilowicz, Jakub, et al.
Published: (2024)
Is the System Message Really Important to Jailbreaks in Large Language Models?
by: Zou, Xiaotian, et al.
Published: (2024)
Can Reinforcement Learning Unlock the Hidden Dangers in Aligned Large Language Models?
by: Karkevandi, Mohammad Bahrami, et al.
Published: (2024)
Can We Use Probing to Better Understand Fine-tuning and Knowledge Distillation of the BERT NLU?
by: Hościłowicz, Jakub, et al.
Published: (2023)
Imposter.AI: Adversarial Attacks with Hidden Intentions towards Aligned Large Language Models
by: Liu, Xiao, et al.
Published: (2024)
Copyright Traps for Large Language Models
by: Meeus, Matthieu, et al.
Published: (2024)
Yet Another Watermark for Large Language Models
by: Bao, Siyuan, et al.
Published: (2025)
Token-Level Privacy in Large Language Models
by: Harel, Re'em, et al.
Published: (2025)
Lateral Phishing With Large Language Models: A Large Organization Comparative Study
by: Bethany, Mazal, et al.
Published: (2024)
Privacy-Preserving Instructions for Aligning Large Language Models
by: Yu, Da, et al.
Published: (2024)
Prompt Stealing Attacks Against Large Language Models
by: Sha, Zeyang, et al.
Published: (2024)
Majority Bit-Aware Watermarking For Large Language Models
by: Xu, Jiahao, et al.
Published: (2025)
Hidden Data Privacy Breaches in Federated Learning
by: Gong, Xueluan, et al.
Published: (2024)
Hidden Ads: Behavior Triggered Semantic Backdoors for Advertisement Injection in Vision Language Models
by: Yao, Duanyi, et al.
Published: (2026)
garak: A Framework for Security Probing Large Language Models
by: Derczynski, Leon, et al.
Published: (2024)
Simulate and Eliminate: Revoke Backdoors for Generative Large Language Models
by: Li, Haoran, et al.
Published: (2024)
Multi-Agent Collaboration in Incident Response with Large Language Models
by: Liu, Zefang
Published: (2024)
Denial-of-Service Poisoning Attacks against Large Language Models
by: Gao, Kuofeng, et al.
Published: (2024)
Watermarking Large Language Models and the Generated Content: Opportunities and Challenges
by: Zhang, Ruisi, et al.
Published: (2024)
Privacy in Large Language Models: Attacks, Defenses and Future Directions
by: Li, Haoran, et al.
Published: (2023)
Resource Consumption Red-Teaming for Large Vision-Language Models
by: Gao, Haoran, et al.
Published: (2025)
EPT Benchmark: Evaluation of Persian Trustworthiness in Large Language Models
by: Mirbagheri, Mohammad Reza, et al.
Published: (2025)
InfoFlood: Jailbreaking Large Language Models with Information Overload
by: Yadav, Advait, et al.
Published: (2025)
On the Hidden Costs of Counterfactual Knowledge Training in LLM Unlearning
by: Ye, Xiaotian, et al.
Published: (2026)
ChatNVD: Advancing Cybersecurity Vulnerability Assessment with Large Language Models
by: Chopra, Shivansh, et al.
Published: (2024)
Harnessing Task Overload for Scalable Jailbreak Attacks on Large Language Models
by: Dong, Yiting, et al.
Published: (2024)
S-Eval: Towards Automated and Comprehensive Safety Evaluation for Large Language Models
by: Yuan, Xiaohan, et al.
Published: (2024)
Large Language Models are Good Attackers: Efficient and Stealthy Textual Backdoor Attacks
by: Li, Ziqiang, et al.
Published: (2024)
Breaking Down the Defenses: A Comparative Survey of Attacks on Large Language Models
by: Chowdhury, Arijit Ghosh, et al.
Published: (2024)
Don't Listen To Me: Understanding and Exploring Jailbreak Prompts of Large Language Models
by: Yu, Zhiyuan, et al.
Published: (2024)
PRP: Propagating Universal Perturbations to Attack Large Language Model Guard-Rails
by: Mangaokar, Neal, et al.
Published: (2024)
ConfGuard: A Simple and Effective Backdoor Detection for Large Language Models
by: Wang, Zihan, et al.
Published: (2025)
EvoDefense: Co-Evolving Black-Box Defense with Large Language Models
by: Li, Yu, et al.
Published: (2026)
Privacy-Preserving Parameter-Efficient Fine-Tuning for Large Language Model Services
by: Li, Yansong, et al.
Published: (2023)
Understanding and Mitigating Over-refusal for Large Language Models via Safety Representation
by: Zhang, Junbo, et al.
Published: (2025)
Retrieval-Augmented Defense: Adaptive and Controllable Jailbreak Prevention for Large Language Models
by: Yang, Guangyu, et al.
Published: (2025)
Building Resilient SMEs: Harnessing Large Language Models for Cyber Security in Australia
by: Kereopa-Yorke, Benjamin
Published: (2023)
EmMark: Robust Watermarks for IP Protection of Embedded Quantized Large Language Models
by: Zhang, Ruisi, et al.
Published: (2024)
Safely Learning with Private Data: A Federated Learning Framework for Large Language Model
by: Zheng, JiaYing, et al.
Published: (2024)