| Main Authors: | Roth, Tom, Unanue, Inigo Jauregi, Abuadbba, Alsharif, Piccardi, Massimo |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2401.08255 |
Similar Items
A Constraint-Enforcing Reward for Adversarial Attacks on Text Classifiers
by: Roth, Tom, et al.
Published: (2024)
Pref-CTRL: Preference Driven LLM Alignment using Representation Editing
by: Ashrafi, Imranul, et al.
Published: (2026)
SumTra: A Differentiable Pipeline for Few-Shot Cross-Lingual Summarization
by: Parnell, Jacob, et al.
Published: (2024)
Alert-ME: An Explainability-Driven Defense Against Adversarial Examples in Transformer-Based Text Classification
by: Sabir, Bushra, et al.
Published: (2023)
Adversarial Attacks Against Automated Fact-Checking: A Survey
by: Liu, Fanzhen, et al.
Published: (2025)
Token-Modification Adversarial Attacks for Natural Language Processing: A Survey
by: Roth, Tom, et al.
Published: (2021)
ViMedCSS: A Vietnamese Medical Code-Switching Speech Dataset & Benchmark
by: Nguyen, Tung X., et al.
Published: (2026)
Modeling the Attack: Detecting AI-Generated Text by Quantifying Adversarial Perturbations
by: Teja, Lekkala Sai, et al.
Published: (2025)
OpenFact at CheckThat! 2024: Combining Multiple Attack Methods for Effective Adversarial Text Generation
by: Lewoniewski, Włodzimierz, et al.
Published: (2024)
HQA-Attack: Toward High Quality Black-Box Hard-Label Adversarial Attack on Text
by: Liu, Han, et al.
Published: (2024)
STACK: Adversarial Attacks on LLM Safeguard Pipelines
by: McKenzie, Ian R., et al.
Published: (2025)
AnthroScore: A Computational Linguistic Measure of Anthropomorphism
by: Cheng, Myra, et al.
Published: (2024)
Adversarial Attacks on Parts of Speech: An Empirical Study in Text-to-Image Generation
by: Shahariar, G M, et al.
Published: (2024)
On Evaluating The Performance of Watermarked Machine-Generated Texts Under Adversarial Attacks
by: Liu, Zesen, et al.
Published: (2024)
MultiSocial: Multilingual Benchmark of Machine-Generated Text Detection of Social-Media Texts
by: Macko, Dominik, et al.
Published: (2024)
Toward Cross-Lingual Quality Classifiers for Multilingual Pretraining Data Selection
by: Turki, Yassine, et al.
Published: (2026)
Increasing the Robustness of the Fine-tuned Multilingual Machine-Generated Text Detectors
by: Macko, Dominik, et al.
Published: (2025)
SoS: Analysis of Surface over Semantics in Multilingual Text-To-Image Generation
by: Holtermann, Carolin, et al.
Published: (2026)
Adversarial Attacks on AI-Generated Text Detection Models: A Token Probability-Based Approach Using Embeddings
by: Kadhim, Ahmed K., et al.
Published: (2025)
Fast Adversarial Training against Textual Adversarial Attacks
by: Yang, Yichen, et al.
Published: (2024)
Are AI-Generated Text Detectors Robust to Adversarial Perturbations?
by: Huang, Guanhua, et al.
Published: (2024)
ReEval: Automatic Hallucination Evaluation for Retrieval-Augmented Large Language Models via Transferable Adversarial Attacks
by: Yu, Xiaodong, et al.
Published: (2023)
MULTITuDE: Large-Scale Multilingual Machine-Generated Text Detection Benchmark
by: Macko, Dominik, et al.
Published: (2023)
A Comparative Analysis of Counterfactual Explanation Methods for Text Classifiers
by: McAleese, Stephen, et al.
Published: (2024)
Better as Generators Than Classifiers: Leveraging LLMs and Synthetic Data for Low-Resource Multilingual Classification
by: Pecher, Branislav, et al.
Published: (2026)
The First Multilingual Model For The Detection of Suicide Texts
by: Zevallos, Rodolfo, et al.
Published: (2024)
Multilingual and Explainable Text Detoxification with Parallel Corpora
by: Dementieva, Daryna, et al.
Published: (2024)
CEAID: Benchmark of Multilingual Machine-Generated Text Detection Methods for Central European Languages
by: Macko, Dominik, et al.
Published: (2025)
SEP-Attack: A Simple and Effective Paradigm for Transfer-Based Textual Adversarial Attack
by: Liu, Han, et al.
Published: (2026)
COMMENTATOR: A Code-mixed Multilingual Text Annotation Framework
by: Sheth, Rajvee, et al.
Published: (2024)
Layer-Wise Perturbations via Sparse Autoencoders for Adversarial Text Generation
by: Shu, Huizhen, et al.
Published: (2025)
Combating Adversarial Attacks with Multi-Agent Debate
by: Chern, Steffi, et al.
Published: (2024)
Adversarial Attacks and Defense for Conversation Entailment Task
by: Yang, Zhenning, et al.
Published: (2024)
Autonomous Data Selection with Zero-shot Generative Classifiers for Mathematical Texts
by: Zhang, Yifan, et al.
Published: (2024)
Toward a Safer Web: Multilingual Multi-Agent LLMs for Mitigating Adversarial Misinformation Attacks
by: Aldahoul, Nouar, et al.
Published: (2025)
Fine-tuning Large Language Models for Multigenerator, Multidomain, and Multilingual Machine-Generated Text Detection
by: Xiong, Feng, et al.
Published: (2024)
Evaluation of Multilingual LLMs Personalized Text Generation Capabilities Targeting Groups and Social-Media Platforms
by: Macko, Dominik
Published: (2026)
Multilingual Hidden Prompt Injection Attacks on LLM-Based Academic Reviewing
by: Theocharopoulos, Panagiotis, et al.
Published: (2025)
SmurfCat at PAN 2024 TextDetox: Alignment of Multilingual Transformers for Text Detoxification
by: Rykov, Elisei, et al.
Published: (2024)
Translation-Enhanced Multilingual Text-to-Image Generation
by: Li, Yaoyiran, et al.
Published: (2023)