| Main authors: | Hoscilowicz, Jakub; Janicki, Artur |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online access: | https://arxiv.org/abs/2511.20494 |
Similar items
Large Language Models for Expansion of Spoken Language Understanding Systems to New Languages
by: Hoscilowicz, Jakub, et al.
Published: (2024)
Large Language Models as Carriers of Hidden Messages
by: Hoscilowicz, Jakub, et al.
Published: (2024)
Can We Use Probing to Better Understand Fine-tuning and Knowledge Distillation of the BERT NLU?
by: Hościłowicz, Jakub, et al.
Published: (2023)
Steerability of Instrumental-Convergence Tendencies in LLMs
by: Hoscilowicz, Jakub
Published: (2026)
Non-Linear Inference Time Intervention: Improving LLM Truthfulness
by: Hoscilowicz, Jakub, et al.
Published: (2024)
ClickAgent: Enhancing UI Location Capabilities of Autonomous Agents
by: Hoscilowicz, Jakub, et al.
Published: (2024)
TF-Attack: Transferable and Fast Adversarial Attacks on Large Language Models
by: Li, Zelin, et al.
Published: (2024)
Robustness of Large Language Models Against Adversarial Attacks
by: Tao, Yiyi, et al.
Published: (2024)
Mechanistic Understanding and Mitigation of Language Confusion in English-Centric Large Language Models
by: Nie, Ercong, et al.
Published: (2025)
Adversarial Evasion Attack Efficiency against Large Language Models
by: Vitorino, João, et al.
Published: (2024)
Vision Language Models are Confused Tourists
by: Irawan, Patrick Amadeus, et al.
Published: (2025)
Cats Confuse Reasoning LLM: Query Agnostic Adversarial Triggers for Reasoning Models
by: Rajeev, Meghana, et al.
Published: (2025)
Human-Interpretable Adversarial Prompt Attack on Large Language Models with Situational Context
by: Das, Nilanjana, et al.
Published: (2024)
The Resurgence of GCG Adversarial Attacks on Large Language Models
by: Tan, Yuting, et al.
Published: (2025)
Benchmarking Gaslighting Negation Attacks Against Multimodal Large Language Models
by: Zhu, Bin, et al.
Published: (2025)
Finding Challenging Metaphors that Confuse Pretrained Language Models
by: Li, Yucheng, et al.
Published: (2024)
Controlling Language Confusion in Multilingual LLMs
by: Lee, Nahyun, et al.
Published: (2025)
Understanding and Mitigating Language Confusion in LLMs
by: Marchisio, Kelly, et al.
Published: (2024)
Adversarial Attack on Large Language Models using Exponentiated Gradient Descent
by: Biswas, Sajib, et al.
Published: (2025)
Jailbreaking Attack against Multimodal Large Language Model
by: Niu, Zhenxing, et al.
Published: (2024)
Robust Fake News Detection using Large Language Models under Adversarial Sentiment Attacks
by: Tahmasebi, Sahar, et al.
Published: (2026)
TLPO: Token-Level Policy Optimization for Mitigating Language Confusion in Large Language Models
by: Choo, Jinho, et al.
Published: (2026)
Uncovering Entity Identity Confusion in Multimodal Knowledge Editing
by: Wu, Shu, et al.
Published: (2026)
Against All Odds: Overcoming Typology, Script, and Language Confusion in Multilingual Embedding Inversion Attacks
by: Chen, Yiyi, et al.
Published: (2024)
Language Confusion Gate: Language-Aware Decoding Through Model Self-Distillation
by: Zhang, Collin, et al.
Published: (2025)
Seeing the Threat: Vulnerabilities in Vision-Language Models to Adversarial Attack
by: Ren, Juan, et al.
Published: (2025)
MultiAgent Collaboration Attack: Investigating Adversarial Attacks in Large Language Model Collaborations via Debate
by: Amayuelas, Alfonso, et al.
Published: (2024)
Active Confusion Expression in Large Language Models: Leveraging World Models toward Better Social Reasoning
by: Du, Jialu, et al.
Published: (2025)
Model Merging to Maintain Language-Only Performance in Developmentally Plausible Multimodal Models
by: Takmaz, Ece, et al.
Published: (2025)
Dynamics of Adversarial Attacks on Large Language Model-Based Search Engines
by: Hu, Xiyang
Published: (2025)
SpeechGuard: Exploring the Adversarial Robustness of Multimodal Large Language Models
by: Peri, Raghuveer, et al.
Published: (2024)
Test-Time Backdoor Attacks on Multimodal Large Language Models
by: Lu, Dong, et al.
Published: (2024)
Attacking Misinformation Detection Using Adversarial Examples Generated by Language Models
by: Przybyła, Piotr, et al.
Published: (2024)
Adversarial Attacks on Large Language Models Using Regularized Relaxation
by: Chacko, Samuel Jacob, et al.
Published: (2024)
Large Language Models for Biomedical Article Classification
by: Proboszcz, Jakub, et al.
Published: (2026)
Large Language Models in Legislative Content Analysis: A Dataset from the Polish Parliament
by: Bryłkowski, Arkadiusz, et al.
Published: (2025)
Large Language Models for Czech Aspect-Based Sentiment Analysis
by: Šmíd, Jakub, et al.
Published: (2025)
Time-To-Inconsistency: A Survival Analysis of Large Language Model Robustness to Adversarial Attacks
by: Li, Yubo, et al.
Published: (2025)
Imposter.AI: Adversarial Attacks with Hidden Intentions towards Aligned Large Language Models
by: Liu, Xiao, et al.
Published: (2024)
Revisiting Character-level Adversarial Attacks for Language Models
by: Rocamora, Elias Abad, et al.
Published: (2024)