:: Library Catalog

Bibliographic Details
Main Authors:	Hoscilowicz, Jakub; Janicki, Artur
Format:	Preprint
Published:	2025
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2511.20494

Similar Items

Large Language Models for Expansion of Spoken Language Understanding Systems to New Languages
by: Hoscilowicz, Jakub, et al.
Published: (2024)

Large Language Models as Carriers of Hidden Messages
by: Hoscilowicz, Jakub, et al.
Published: (2024)

Can We Use Probing to Better Understand Fine-tuning and Knowledge Distillation of the BERT NLU?
by: Hościłowicz, Jakub, et al.
Published: (2023)

Steerability of Instrumental-Convergence Tendencies in LLMs
by: Hoscilowicz, Jakub
Published: (2026)

Non-Linear Inference Time Intervention: Improving LLM Truthfulness
by: Hoscilowicz, Jakub, et al.
Published: (2024)

ClickAgent: Enhancing UI Location Capabilities of Autonomous Agents
by: Hoscilowicz, Jakub, et al.
Published: (2024)

TF-Attack: Transferable and Fast Adversarial Attacks on Large Language Models
by: Li, Zelin, et al.
Published: (2024)

Robustness of Large Language Models Against Adversarial Attacks
by: Tao, Yiyi, et al.
Published: (2024)

Mechanistic Understanding and Mitigation of Language Confusion in English-Centric Large Language Models
by: Nie, Ercong, et al.
Published: (2025)

Adversarial Evasion Attack Efficiency against Large Language Models
by: Vitorino, João, et al.
Published: (2024)

Vision Language Models are Confused Tourists
by: Irawan, Patrick Amadeus, et al.
Published: (2025)

Cats Confuse Reasoning LLM: Query Agnostic Adversarial Triggers for Reasoning Models
by: Rajeev, Meghana, et al.
Published: (2025)

Human-Interpretable Adversarial Prompt Attack on Large Language Models with Situational Context
by: Das, Nilanjana, et al.
Published: (2024)

The Resurgence of GCG Adversarial Attacks on Large Language Models
by: Tan, Yuting, et al.
Published: (2025)

Benchmarking Gaslighting Negation Attacks Against Multimodal Large Language Models
by: Zhu, Bin, et al.
Published: (2025)

Finding Challenging Metaphors that Confuse Pretrained Language Models
by: Li, Yucheng, et al.
Published: (2024)

Controlling Language Confusion in Multilingual LLMs
by: Lee, Nahyun, et al.
Published: (2025)

Understanding and Mitigating Language Confusion in LLMs
by: Marchisio, Kelly, et al.
Published: (2024)

Adversarial Attack on Large Language Models using Exponentiated Gradient Descent
by: Biswas, Sajib, et al.
Published: (2025)

Jailbreaking Attack against Multimodal Large Language Model
by: Niu, Zhenxing, et al.
Published: (2024)

Robust Fake News Detection using Large Language Models under Adversarial Sentiment Attacks
by: Tahmasebi, Sahar, et al.
Published: (2026)

TLPO: Token-Level Policy Optimization for Mitigating Language Confusion in Large Language Models
by: Choo, Jinho, et al.
Published: (2026)

Uncovering Entity Identity Confusion in Multimodal Knowledge Editing
by: Wu, Shu, et al.
Published: (2026)

Against All Odds: Overcoming Typology, Script, and Language Confusion in Multilingual Embedding Inversion Attacks
by: Chen, Yiyi, et al.
Published: (2024)

Language Confusion Gate: Language-Aware Decoding Through Model Self-Distillation
by: Zhang, Collin, et al.
Published: (2025)

Seeing the Threat: Vulnerabilities in Vision-Language Models to Adversarial Attack
by: Ren, Juan, et al.
Published: (2025)

MultiAgent Collaboration Attack: Investigating Adversarial Attacks in Large Language Model Collaborations via Debate
by: Amayuelas, Alfonso, et al.
Published: (2024)

Active Confusion Expression in Large Language Models: Leveraging World Models toward Better Social Reasoning
by: Du, Jialu, et al.
Published: (2025)

Model Merging to Maintain Language-Only Performance in Developmentally Plausible Multimodal Models
by: Takmaz, Ece, et al.
Published: (2025)

Dynamics of Adversarial Attacks on Large Language Model-Based Search Engines
by: Hu, Xiyang
Published: (2025)

SpeechGuard: Exploring the Adversarial Robustness of Multimodal Large Language Models
by: Peri, Raghuveer, et al.
Published: (2024)

Test-Time Backdoor Attacks on Multimodal Large Language Models
by: Lu, Dong, et al.
Published: (2024)

Attacking Misinformation Detection Using Adversarial Examples Generated by Language Models
by: Przybyła, Piotr, et al.
Published: (2024)

Adversarial Attacks on Large Language Models Using Regularized Relaxation
by: Chacko, Samuel Jacob, et al.
Published: (2024)

Large Language Models for Biomedical Article Classification
by: Proboszcz, Jakub, et al.
Published: (2026)

Large Language Models in Legislative Content Analysis: A Dataset from the Polish Parliament
by: Bryłkowski, Arkadiusz, et al.
Published: (2025)

Large Language Models for Czech Aspect-Based Sentiment Analysis
by: Šmíd, Jakub, et al.
Published: (2025)

Time-To-Inconsistency: A Survival Analysis of Large Language Model Robustness to Adversarial Attacks
by: Li, Yubo, et al.
Published: (2025)

Imposter.AI: Adversarial Attacks with Hidden Intentions towards Aligned Large Language Models
by: Liu, Xiao, et al.
Published: (2024)

Revisiting Character-level Adversarial Attacks for Language Models
by: Rocamora, Elias Abad, et al.
Published: (2024)