:: Library Catalog

Bibliographic Details
Main Authors:	Roth, Tom; Unanue, Inigo Jauregi; Abuadbba, Alsharif; Piccardi, Massimo
Format:	Preprint
Published:	2024
Subjects:	Computation and Language; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2401.08255

Similar Items

A Constraint-Enforcing Reward for Adversarial Attacks on Text Classifiers
by: Roth, Tom, et al.
Published: (2024)

Pref-CTRL: Preference Driven LLM Alignment using Representation Editing
by: Ashrafi, Imranul, et al.
Published: (2026)

SumTra: A Differentiable Pipeline for Few-Shot Cross-Lingual Summarization
by: Parnell, Jacob, et al.
Published: (2024)

Alert-ME: An Explainability-Driven Defense Against Adversarial Examples in Transformer-Based Text Classification
by: Sabir, Bushra, et al.
Published: (2023)

Adversarial Attacks Against Automated Fact-Checking: A Survey
by: Liu, Fanzhen, et al.
Published: (2025)

Token-Modification Adversarial Attacks for Natural Language Processing: A Survey
by: Roth, Tom, et al.
Published: (2021)

ViMedCSS: A Vietnamese Medical Code-Switching Speech Dataset & Benchmark
by: Nguyen, Tung X., et al.
Published: (2026)

Modeling the Attack: Detecting AI-Generated Text by Quantifying Adversarial Perturbations
by: Teja, Lekkala Sai, et al.
Published: (2025)

OpenFact at CheckThat! 2024: Combining Multiple Attack Methods for Effective Adversarial Text Generation
by: Lewoniewski, Włodzimierz, et al.
Published: (2024)

HQA-Attack: Toward High Quality Black-Box Hard-Label Adversarial Attack on Text
by: Liu, Han, et al.
Published: (2024)

STACK: Adversarial Attacks on LLM Safeguard Pipelines
by: McKenzie, Ian R., et al.
Published: (2025)

AnthroScore: A Computational Linguistic Measure of Anthropomorphism
by: Cheng, Myra, et al.
Published: (2024)

Adversarial Attacks on Parts of Speech: An Empirical Study in Text-to-Image Generation
by: Shahariar, G M, et al.
Published: (2024)

On Evaluating The Performance of Watermarked Machine-Generated Texts Under Adversarial Attacks
by: Liu, Zesen, et al.
Published: (2024)

MultiSocial: Multilingual Benchmark of Machine-Generated Text Detection of Social-Media Texts
by: Macko, Dominik, et al.
Published: (2024)

Toward Cross-Lingual Quality Classifiers for Multilingual Pretraining Data Selection
by: Turki, Yassine, et al.
Published: (2026)

Increasing the Robustness of the Fine-tuned Multilingual Machine-Generated Text Detectors
by: Macko, Dominik, et al.
Published: (2025)

SoS: Analysis of Surface over Semantics in Multilingual Text-To-Image Generation
by: Holtermann, Carolin, et al.
Published: (2026)

Adversarial Attacks on AI-Generated Text Detection Models: A Token Probability-Based Approach Using Embeddings
by: Kadhim, Ahmed K., et al.
Published: (2025)

Fast Adversarial Training against Textual Adversarial Attacks
by: Yang, Yichen, et al.
Published: (2024)

Are AI-Generated Text Detectors Robust to Adversarial Perturbations?
by: Huang, Guanhua, et al.
Published: (2024)

ReEval: Automatic Hallucination Evaluation for Retrieval-Augmented Large Language Models via Transferable Adversarial Attacks
by: Yu, Xiaodong, et al.
Published: (2023)

MULTITuDE: Large-Scale Multilingual Machine-Generated Text Detection Benchmark
by: Macko, Dominik, et al.
Published: (2023)

A Comparative Analysis of Counterfactual Explanation Methods for Text Classifiers
by: McAleese, Stephen, et al.
Published: (2024)

Better as Generators Than Classifiers: Leveraging LLMs and Synthetic Data for Low-Resource Multilingual Classification
by: Pecher, Branislav, et al.
Published: (2026)

The First Multilingual Model For The Detection of Suicide Texts
by: Zevallos, Rodolfo, et al.
Published: (2024)

Multilingual and Explainable Text Detoxification with Parallel Corpora
by: Dementieva, Daryna, et al.
Published: (2024)

CEAID: Benchmark of Multilingual Machine-Generated Text Detection Methods for Central European Languages
by: Macko, Dominik, et al.
Published: (2025)

SEP-Attack: A Simple and Effective Paradigm for Transfer-Based Textual Adversarial Attack
by: Liu, Han, et al.
Published: (2026)

COMMENTATOR: A Code-mixed Multilingual Text Annotation Framework
by: Sheth, Rajvee, et al.
Published: (2024)

Layer-Wise Perturbations via Sparse Autoencoders for Adversarial Text Generation
by: Shu, Huizhen, et al.
Published: (2025)

Combating Adversarial Attacks with Multi-Agent Debate
by: Chern, Steffi, et al.
Published: (2024)

Adversarial Attacks and Defense for Conversation Entailment Task
by: Yang, Zhenning, et al.
Published: (2024)

Autonomous Data Selection with Zero-shot Generative Classifiers for Mathematical Texts
by: Zhang, Yifan, et al.
Published: (2024)

Toward a Safer Web: Multilingual Multi-Agent LLMs for Mitigating Adversarial Misinformation Attacks
by: Aldahoul, Nouar, et al.
Published: (2025)

Fine-tuning Large Language Models for Multigenerator, Multidomain, and Multilingual Machine-Generated Text Detection
by: Xiong, Feng, et al.
Published: (2024)

Evaluation of Multilingual LLMs Personalized Text Generation Capabilities Targeting Groups and Social-Media Platforms
by: Macko, Dominik
Published: (2026)

Multilingual Hidden Prompt Injection Attacks on LLM-Based Academic Reviewing
by: Theocharopoulos, Panagiotis, et al.
Published: (2025)

SmurfCat at PAN 2024 TextDetox: Alignment of Multilingual Transformers for Text Detoxification
by: Rykov, Elisei, et al.
Published: (2024)

Translation-Enhanced Multilingual Text-to-Image Generation
by: Li, Yaoyiran, et al.
Published: (2023)