:: Library Catalog

Bibliographic Details
Main Authors:	Verma, Rakesh M.; Dershowitz, Nachum; Zeng, Victor; Boumber, Dainis; Liu, Xuting
Format:	Preprint
Published:	2024
Subjects:	Computation and Language; Cryptography and Security; Computers and Society
Online Access:	https://arxiv.org/abs/2402.01019

Similar Items

A Roadmap for Multilingual, Multimodal Domain Independent Deception Detection
by: Boumber, Dainis, et al.
Published: (2024)

The Pitfalls of Publishing in the Age of LLMs: Strange and Surprising Adventures with a High-Impact NLP Journal
by: Verma, Rakesh M., et al.
Published: (2024)

Fake News, Disinformation, and Deepfakes: Leveraging Distributed Ledger Technologies and Blockchain to Combat Digital Deception and Counterfeit Reality
by: Fraga-Lamas, Paula, et al.
Published: (2019)

Let's Measure the Elephant in the Room: Facilitating Personalized Automated Analysis of Privacy Policies at Scale
by: Zhao, Rui, et al.
Published: (2025)

Homograph Attacks on Maghreb Sentiment Analyzers
by: Qachfar, Fatima Zahra, et al.
Published: (2024)

Access Over Deception: Fighting Deceptive Patterns through Accessibility
by: Pellkvist, Tobias, et al.
Published: (2026)

SecureForge: Finding and Preventing Vulnerabilities in LLM-Generated Code via Prompt Optimization
by: Liu, Houjun, et al.
Published: (2026)

Privacy Computing Meets Metaverse: Necessity, Taxonomy and Challenges
by: Chen, Chuan, et al.
Published: (2023)

Chameleon Channels: Measuring YouTube Accounts Repurposed for Deception and Profit
by: Cuevas, Alejandro, et al.
Published: (2025)

Provably Secure Disambiguating Neural Linguistic Steganography
by: Qi, Yuang, et al.
Published: (2024)

Honeyquest: Rapidly Measuring the Enticingness of Cyber Deception Techniques with Code-based Questionnaires
by: Kahlhofer, Mario, et al.
Published: (2024)

An Investigation into Misuse of Java Security APIs by Large Language Models
by: Mousavi, Zahra, et al.
Published: (2024)

Data Defenses Against Large Language Models
by: Agnew, William, et al.
Published: (2024)

How Susceptible are Large Language Models to Ideological Manipulation?
by: Chen, Kai, et al.
Published: (2024)

Get my drift? Catching LLM Task Drift with Activation Deltas
by: Abdelnabi, Sahar, et al.
Published: (2024)

ConVerse: Benchmarking Contextual Safety in Agent-to-Agent Conversations
by: Gomaa, Amr, et al.
Published: (2025)

Steering the CensorShip: Uncovering Representation Vectors for LLM "Thought" Control
by: Cyberey, Hannah, et al.
Published: (2025)

AI Agents May Always Fall for Prompt Injections
by: Abdelnabi, Sahar, et al.
Published: (2026)

A Content-Preserving Secure Linguistic Steganography
by: Xiang, Lingyun, et al.
Published: (2025)

The Landscape of Prompt Injection Threats in LLM Agents: From Taxonomy to Analysis
by: Wang, Peiran, et al.
Published: (2026)

RealHarm: A Collection of Real-World Language Model Application Failures
by: Jeune, Pierre Le, et al.
Published: (2025)

Hardware-Level Governance of AI Compute: A Feasibility Taxonomy for Regulatory Compliance and Treaty Verification
by: Ansari, Samar
Published: (2026)

Digital Deception: Generative Artificial Intelligence in Social Engineering and Phishing
by: Schmitt, Marc, et al.
Published: (2023)

Beyond Context: Large Language Models' Failure to Grasp Users' Intent
by: Hussain, Ahmed M., et al.
Published: (2025)

TombRaider: Entering the Vault of History to Jailbreak Large Language Models
by: Ding, Junchen, et al.
Published: (2025)

Phare: A Safety Probe for Large Language Models
by: Jeune, Pierre Le, et al.
Published: (2025)

AgentShield: Deception-based Compromise Detection for Tool-using LLM Agents
by: Rassul, Yassin H., et al.
Published: (2026)

Medical Malice: A Dataset for Context-Aware Safety in Healthcare LLMs
by: D'addario, Andrew Maranhão Ventura
Published: (2025)

A Survey on Responsible LLMs: Inherent Risk, Malicious Use, and Mitigation Strategy
by: Wang, Huandong, et al.
Published: (2025)

k-SemStamp: A Clustering-Based Semantic Watermark for Detection of Machine-Generated Text
by: Hou, Abe Bohan, et al.
Published: (2024)

SafeCOMM: A Study on Safety Degradation in Fine-Tuned Telecom Large Language Models
by: Djuhera, Aladin, et al.
Published: (2025)

SimMark: A Robust Sentence-Level Similarity-Based Watermarking Algorithm for Large Language Models
by: Dabiriaghdam, Amirhossein, et al.
Published: (2025)

The TCF doesn't really A(A)ID -- Automatic Privacy Analysis and Legal Compliance of TCF-based Android Applications
by: Morel, Victor, et al.
Published: (2026)

Zero-shot Generative Linguistic Steganography
by: Lin, Ke, et al.
Published: (2024)

Talk is (Not) Cheap: A Taxonomy and Benchmark Coverage Audit for LLM Attacks
by: Iyer, Karthik Raghu, et al.
Published: (2026)

AuditGPT: Auditing Smart Contracts with ChatGPT
by: Xia, Shihao, et al.
Published: (2024)

How Alignment and Jailbreak Work: Explain LLM Safety through Intermediate Hidden States
by: Zhou, Zhenhong, et al.
Published: (2024)

Attacks on Third-Party APIs of Large Language Models
by: Zhao, Wanru, et al.
Published: (2024)

BadFair: Backdoored Fairness Attacks with Group-conditioned Triggers
by: Xue, Jiaqi, et al.
Published: (2024)

Automatic Pseudo-Harmful Prompt Generation for Evaluating False Refusals in Large Language Models
by: An, Bang, et al.
Published: (2024)