| Main Authors: | Nuriyev, Amir, Kulp, Gabriel |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.04105 |
Similar Items
MetaMoE: Diversity-Aware Proxy Selection for Privacy-Preserving Mixture-of-Experts Unification
by: Jiang, Weisen, et al.
Published: (2026)
How Much Do Large Language Model Cheat on Evaluation? Benchmarking Overestimation under the One-Time-Pad-Based Framework
by: Liang, Zi, et al.
Published: (2025)
Text2VLM: Adapting Text-Only Datasets to Evaluate Alignment Training in Visual Language Models
by: Downer, Gabriel, et al.
Published: (2025)
LLM Jailbreak Detection for (Almost) Free!
by: Chen, Guorui, et al.
Published: (2025)
RouteScan: A Non-Intrusive Approach to Auditing MoE LLMs Safety via Expert Routing Telemetry
by: Lv, Bo, et al.
Published: (2026)
SecMoE: Communication-Efficient Secure MoE Inference via Select-Then-Compute
by: Shen, Bowen, et al.
Published: (2026)
Monitoring the Internal Monologue: Probe Trajectories Reveal Reasoning Dynamics
by: Chrabąszcz, Maciej, et al.
Published: (2026)
Toward Cybersecurity-Expert Small Language Models
by: Levi, Matan, et al.
Published: (2025)
GenBreak: Red Teaming Text-to-Image Generators Using Large Language Models
by: Wang, Zilong, et al.
Published: (2025)
Attacks against Abstractive Text Summarization Models through Lead Bias and Influence Functions
by: Thota, Poojitha, et al.
Published: (2024)
AEIOU: A Unified Defense Framework against NSFW Prompts in Text-to-Image Models
by: Wang, Yiming, et al.
Published: (2024)
EPT Benchmark: Evaluation of Persian Trustworthiness in Large Language Models
by: Mirbagheri, Mohammad Reza, et al.
Published: (2025)
Bileve: Securing Text Provenance in Large Language Models Against Spoofing with Bi-level Signature
by: Zhou, Tong, et al.
Published: (2024)
StructuralSleight: Automated Jailbreak Attacks on Large Language Models Utilizing Uncommon Text-Organization Structures
by: Li, Bangxin, et al.
Published: (2024)
Data Extraction Attacks in Retrieval-Augmented Generation via Backdoors
by: Peng, Yuefeng, et al.
Published: (2024)
Real-time and Zero-footprint Bag of Synthetic Syllables Algorithm for E-mail Spam Detection Using Subject Line and Short Text Fields
by: Selitskiy, Stanislav
Published: (2025)
Beyond Jailbreaks: Revealing Stealthier and Broader LLM Security Risks Stemming from Alignment Failures
by: Zhou, Yukai, et al.
Published: (2025)
Adversarial Text Generation with Dynamic Contextual Perturbation
by: Waghela, Hetvi, et al.
Published: (2025)
SecureGate: Learning When to Reveal PII Safely via Token-Gated Dual-Adapters for Federated LLMs
by: Shaaban, Mohamed, et al.
Published: (2026)
Revealing Weaknesses in Text Watermarking Through Self-Information Rewrite Attacks
by: Cheng, Yixin, et al.
Published: (2025)
Robustness Assessment and Enhancement of Text Watermarking for Google's SynthID
by: Han, Xia, et al.
Published: (2025)
Traffic-MoE: A Sparse Foundation Model for Network Traffic Analysis
by: Zhou, Jiajun, et al.
Published: (2026)
MGTEVAL: An Interactive Platform for Systemtic Evaluation of Machine-Generated Text Detectors
by: Li, Yuanfan, et al.
Published: (2026)
Block-wise Codeword Embedding for Reliable Multi-bit Text Watermarking
by: Kim, Joeun, et al.
Published: (2026)
DP-BART for Privatized Text Rewriting under Local Differential Privacy
by: Igamberdiev, Timour, et al.
Published: (2023)
Factuality Beyond Coherence: Evaluating LLM Watermarking Methods for Medical Texts
by: Hastuti, Rochana Prih, et al.
Published: (2025)
HLPD: Aligning LLMs to Human Language Preference for Machine-Revised Text Detection
by: Dai, Fangqi, et al.
Published: (2025)
Beyond Text: Unveiling Privacy Vulnerabilities in Multi-modal Retrieval-Augmented Generation
by: Zhang, Jiankun, et al.
Published: (2025)
MaskSQL: Safeguarding Privacy for LLM-Based Text-to-SQL via Abstraction
by: Abedini, Sepideh, et al.
Published: (2025)
TSCheater: Generating High-Quality Tibetan Adversarial Texts via Visual Similarity
by: Cao, Xi, et al.
Published: (2024)
SecurityLingua: Efficient Defense of LLM Jailbreak Attacks via Security-Aware Prompt Compression
by: Li, Yucheng, et al.
Published: (2025)
BinarySelect to Improve Accessibility of Black-Box Attack Research
by: Ghosh, Shatarupa, et al.
Published: (2024)
GradEscape: A Gradient-Based Evader Against AI-Generated Text Detectors
by: Meng, Wenlong, et al.
Published: (2025)
Is the Digital Forensics and Incident Response Pipeline Ready for Text-Based Threats in LLM Era?
by: Bhandarkar, Avanti, et al.
Published: (2024)
Efficiently and Effectively: A Two-stage Approach to Balance Plaintext and Encrypted Text for Traffic Classification
by: Peng, Wei, et al.
Published: (2024)
Iron Sharpens Iron: Defending Against Attacks in Machine-Generated Text Detection with Adversarial Training
by: Li, Yuanfan, et al.
Published: (2025)
Text Embedding Inversion Security for Multilingual Language Models
by: Chen, Yiyi, et al.
Published: (2024)
Fight Poison with Poison: Enhancing Robustness in Few-shot Machine-Generated Text Detection with Adversarial Training
by: Duan, Wenjing, et al.
Published: (2026)
Beyond Theoretical Bounds: Empirical Privacy Loss Calibration for Text Rewriting Under Local Differential Privacy
by: Li, Weijun, et al.
Published: (2026)
A Character-based Diffusion Embedding Algorithm for Enhancing the Generation Quality of Generative Linguistic Steganographic Texts
by: Chen, Yingquan, et al.
Published: (2025)